Abstract
The diversity of network forms and services makes it difficult for the TCP protocol to achieve good performance. The current XQUIC implementation of the QUIC protocol still adopts TCP’s heuristic congestion control mechanisms, resulting in limited performance gains. In recent years, reinforcement learning-based congestion control has emerged as an effective alternative to traditional strategies, but existing algorithms are not optimized for dynamic network characteristics. In this paper, we propose a deep reinforcement learning-based congestion control algorithm, Dynamic Network Congestion Control for QUIC Based on PPO (DNCCQ-PPO). To address the heterogeneity of dynamic network training environments, we introduce a novel sampling interaction mechanism, action space, and reward function, and propose an asynchronous distributed training scheme. Additionally, we develop a generalized reinforcement learning framework for developing congestion control algorithms on XQUIC, and verify the performance of DNCCQ-PPO within this framework. Experimental results demonstrate that the algorithm converges quickly and trains stably. In performance tests, DNCCQ-PPO achieves throughput comparable to that of CUBIC while reducing latency by 54.78%. In multi-stream fairness tests, it outperforms several mainstream algorithms. In satellite network simulations, DNCCQ-PPO maintains high throughput while reducing latency by 69.58% and 72.77% compared to CUBIC and PCC, respectively.