DOI: 10.3390/app16126271 ISSN: 2076-3417

Risk-Sensitive Distributional Proximal Policy Optimization for Safe Highway Lane-Change Decision-Making

Qing Ye, Rongliang Zhou, Jiakun Huang, Yaxuan Liu, Xiaolin Song

Decision-making is a critical module for intelligent vehicles to achieve safe and efficient autonomous driving. However, most existing reinforcement learning-based decision-making methods optimize policies by maximizing the expected return, which may inadequately account for low-probability but high-cost safety risks in complex traffic interactions. To address this issue, this paper proposes a risk-sensitive distributional variant of Proximal Policy Optimization (PPO), termed Risk-Sensitive Distributional Proximal Policy Optimization (RSDPPO), for highway lane-changing decision-making. Within the PPO framework, a distributional state-value function is introduced to model the return distribution under the current policy, and a Wang distortion-based risk measure is further incorporated to construct a risk-sensitive advantage function. In this way, risk information contained in the return distribution is propagated into the policy gradient update, guiding the learned policy to avoid high-risk driving behaviors while maintaining training stability. Simulation experiments are conducted in a highway lane-changing scenario with heterogeneous surrounding vehicles. The results show that, under medium-density traffic, the proposed method outperforms representative baseline algorithms in cumulative reward, success rate, and safety reward. Further evaluation under higher-density traffic shows that RSDPPO maintains better overall performance, indicating stronger adaptability to denser traffic conditions. Ablation studies further show that risk-averse distortion improves the balance between safety and efficiency by increasing safety margins during car-following and lane-changing maneuvers. These results indicate that RSDPPO provides an effective risk-sensitive policy optimization framework for safety-oriented highway lane-changing decision-making.
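The abstract does not give the exact formulas, so the following is only a minimal sketch of how a Wang distortion could turn a quantile-based return distribution into a risk-sensitive value and advantage. Several things here are assumptions rather than the authors' stated method: the return distribution is represented by N equally weighted quantile values, the sign convention treats eta < 0 as risk-averse, the advantage is a one-step temporal-difference form, and the function names (wang_distortion, distorted_value, risk_sensitive_td_advantage) are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the Wang transform


def wang_distortion(u, eta):
    """Distort a cumulative probability u in [0, 1].

    Assumed convention: eta < 0 shifts probability mass toward low returns
    (risk-averse), eta = 0 leaves u unchanged (risk-neutral).
    """
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(u) - eta)


def distorted_value(quantiles, eta):
    """Risk-distorted expectation of a return distribution given as quantile values."""
    ordered = sorted(quantiles)
    n = len(ordered)
    value = 0.0
    for i, theta in enumerate(ordered):
        # Weight of the i-th quantile = distorted mass of the interval [i/n, (i+1)/n].
        weight = wang_distortion((i + 1) / n, eta) - wang_distortion(i / n, eta)
        value += weight * theta
    return value


def risk_sensitive_td_advantage(reward, gamma, quantiles_s, quantiles_next, eta, done=False):
    """One-step advantage estimate using distorted values (an assumed form)."""
    next_value = 0.0 if done else distorted_value(quantiles_next, eta)
    return reward + gamma * next_value - distorted_value(quantiles_s, eta)


if __name__ == "__main__":
    returns = [-10.0, 0.0, 1.0, 2.0]  # illustrative quantile values, not paper data
    print("risk-neutral value:", distorted_value(returns, 0.0))
    print("risk-averse value:", distorted_value(returns, -0.75))
```

With eta = 0 the weights are uniform and the distorted value reduces to the ordinary mean, so the advantage falls back to a standard one-step estimate. With eta < 0 the rare low-return quantile receives more weight, which lowers the value of states whose return distribution has a heavy left tail. In a PPO-style update, an advantage of this kind would take the place of the usual expected-return advantage in the clipped surrogate objective.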
