Abstract: ...representation capability of deep learning (DL) and the optimal decision-making and control capability of reinforcement learning (RL), is a good approach to this problem. The traffic environment is built by combining the intelligent driver model (IDM) with a lane-change model as the behavioral model for vehicles. To increase the stochasticity of this environment, a speed distribution with a cutoff is defined for the traffic cars, and different politeness factors are used to represent distinct lane-change styles. To train an artificial agent toward strategies that yield the greatest long-term reward and support sophisticated maneuvers, the deep deterministic policy gradient (DDPG) algorithm is used. The reward function is designed to trade off vehicle speed, stability, and driving safety. Results show that the proposed approach achieves good autonomous maneuvering in a scenario with complex traffic behavior by interacting with the environment.
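As a rough illustration of the traffic-environment ingredients named in the abstract, the sketch below computes the standard IDM acceleration and draws a desired speed from a normal distribution with a cutoff. It is a minimal sketch, not the authors' implementation: the function names, the choice of a normal distribution, and every parameter value (desired speed, time headway, cutoff bounds, and so on) are assumptions made for illustration.

```python
import math
import random


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4.0):
    """Standard intelligent driver model (IDM) acceleration in m/s^2.

    v   : own speed (m/s)
    gap : bumper-to-bumper distance to the leading vehicle (m)
    dv  : approach rate, own speed minus leader speed (m/s)
    All default parameter values are assumed, not taken from the paper.
    """
    # Desired dynamic gap: standstill distance + headway term + braking term.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term (v / v0) ** delta and interaction term (s_star / gap) ** 2.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def sample_desired_speed(rng, mean=25.0, std=3.0, low=20.0, high=30.0):
    """Draw a desired speed from a normal distribution cut off at [low, high].

    Rejection sampling is used; the distribution and bounds are assumptions.
    """
    while True:
        v = rng.gauss(mean, std)
        if low <= v <= high:
            return v


if __name__ == "__main__":
    rng = random.Random(0)
    v0 = sample_desired_speed(rng)
    # A car at 20 m/s closing at 5 m/s on a leader only 5 m ahead should brake.
    print(idm_acceleration(20.0, gap=5.0, dv=5.0, v0=v0))
```

Giving each traffic car its own sampled desired speed `v0` is one simple way to make the surrounding traffic less uniform, which is the purpose the abstract attributes to the speed distribution with a cutoff.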