Abstract: In recent years, deep reinforcement learning has been widely applied in the field of traffic signal control as an efficient and reliable machine learning method. However, existing signal timing methods typically ignore priority passage for special vehicles (e.g., ambulances and fire engines); in addition, signal timing methods based on conventional deep reinforcement learning optimize a single objective, which leads to poor performance in complex traffic scenarios. To address these problems, a dual-mode multi-objective signal timing method with special-vehicle priority (Dual-mode Multi-objective signal timing method based on Double DQN, DMDD) is proposed on the basis of Double DQN to improve intersection throughput across different traffic scenarios. The method first selects a signal control mode according to the saturation state of the intersection; in the emergency control mode, special vehicles are assigned a higher passage weight, which helps them pass through the intersection faster. Next, a separate neural network is designed to compute the reward for each of three indicators: waiting time, queue length, and CO2 emissions. Finally, Double DQN is used to select the optimal signal phase, and throughput is improved by flexibly switching signal phases. Experimental results based on SUMO show that, compared with the baseline methods, DMDD effectively reduces the waiting time of special vehicles at intersections, the queue length, and CO2 emissions; special vehicles pass through intersections faster, and traffic efficiency is effectively improved.
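A minimal PyTorch sketch of the phase-selection core described above. The network shape, the epsilon-greedy policy, and the `blended_reward` helper are illustrative assumptions (the paper designs a separate reward network per indicator); only the Double DQN phase selection and the emergency weighting idea come from the abstract:

```python
import random
import torch
import torch.nn as nn

class PhaseQNet(nn.Module):
    """Small MLP mapping an intersection state vector to Q values over signal phases."""
    def __init__(self, state_dim: int, n_phases: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_phases),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def select_phase(online: PhaseQNet, state: torch.Tensor,
                 epsilon: float, n_phases: int) -> int:
    """Epsilon-greedy choice of the next signal phase from the online network."""
    if random.random() < epsilon:
        return random.randrange(n_phases)
    with torch.no_grad():
        return int(online(state.unsqueeze(0)).argmax(dim=1).item())

def blended_reward(wait_r: float, queue_r: float, co2_r: float,
                   emergency_mode: bool, ev_weight: float = 2.0) -> float:
    """Combine the three per-objective rewards (waiting time, queue length,
    CO2 emissions); the waiting-time term is weighted more heavily when the
    emergency control mode is active, mirroring special-vehicle priority.
    The 2.0 weight is a hypothetical value, not taken from the paper."""
    w = ev_weight if emergency_mode else 1.0
    return w * wait_r + queue_r + co2_r
```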
Funding: supported by the Research and Development of Key Technologies of the Regional Energy Internet based on Multi-Energy Complementary and Collaborative Optimization (BE2020081).
Abstract: Multi-energy microgrids (MEMG) play an important role in promoting carbon neutrality and achieving sustainable development. This study investigates an effective energy management strategy (EMS) for MEMG. First, an energy management system model that allows for intra-microgrid energy conversion is developed, and the corresponding Markov decision process (MDP) problem is formulated. Subsequently, an improved double deep Q network (iDDQN) algorithm is proposed to enhance the exploration ability by modifying the calculation of the Q value, and prioritized experience replay (PER) is introduced into the iDDQN to improve the training speed and effectiveness. Finally, taking advantage of the federated learning (FL) and iDDQN algorithms, a federated iDDQN is proposed to design an MEMG energy management strategy that enables each microgrid to share its experience, in the form of local neural network (NN) parameters, with the federation layer, thus ensuring the privacy and security of data. The simulation results validate the superior performance of the proposed strategy in minimizing the economic costs of the MEMG while reducing CO2 emissions and protecting data privacy.
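A sketch of the parameter-sharing step in the federated iDDQN, in the style of FedAvg: each microgrid uploads only its local network weights, and the federation layer averages and redistributes them, so raw operating data never leave the microgrid. The function names and the equal weighting across microgrids are assumptions for illustration:

```python
import copy
import torch
import torch.nn as nn

def federated_average(local_models: list) -> dict:
    """Average the state_dicts of the microgrids' local Q networks.
    Only NN parameters are exchanged, never raw load or price data."""
    states = [m.state_dict() for m in local_models]
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in states]).mean(dim=0)
    return avg

def broadcast(global_state: dict, local_models: list) -> None:
    """Push the aggregated parameters back to every microgrid agent,
    which then resumes local iDDQN training from the shared weights."""
    for m in local_models:
        m.load_state_dict(global_state)
```

In each federation round, the layer would call `federated_average` on the uploaded models and then `broadcast` the result, so every agent benefits from the others' experience without seeing their data.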
Abstract: Over the years, methods that combine deep reinforcement learning with intelligent transportation systems have achieved remarkable results in traffic signal control. However, deep reinforcement learning alone cannot compensate for the deficiencies of convolutional neural networks in feature extraction, which degrades the agent's overall policy output. To address this feature-extraction problem, a deep reinforcement learning model with an attention mechanism is proposed for traffic signal control on the basis of the double deep Q network (double DQN) model. A squeeze-and-excitation networks (SENet) attention mechanism is added to a three-dimensional convolutional neural network; by modeling the interdependencies among feature-map channels, it enhances the representation quality of the convolutional network so that the optimal traffic signal control action is output. Experimental results show that the algorithm delivers good traffic signal control performance with notable stability.
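The squeeze-and-excitation mechanism the abstract adds to the 3-D convolutional network can be sketched as follows; the channel count and reduction ratio are illustrative choices, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class SEBlock3D(nn.Module):
    """Squeeze-and-excitation for 3-D feature maps: global average pooling
    (squeeze), then a two-layer bottleneck with sigmoid gating (excitation)
    that rescales each channel by its learned importance."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c = x.shape[:2]
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * w  # channel-wise reweighting of the conv features

# Example: attach the SE block after a 3-D convolution layer.
backbone = nn.Sequential(
    nn.Conv3d(4, 32, kernel_size=3, padding=1), nn.ReLU(),
    SEBlock3D(32),
)
```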
Funding: supported by the National Key Research and Development Program of China (No. 2021YFE0116900), the National Natural Science Foundation of China (Nos. 42275157, 62002276, and 41975142), and the Major Program of the National Social Science Fund of China (No. 17ZDA092).
Abstract: Edge computing nodes undertake an increasing number of tasks as business density rises. How to efficiently allocate large-scale, dynamic workloads to edge computing resources has therefore become a critical challenge. This study proposes an edge task scheduling approach based on an improved Double Deep Q Network (DQN), in which the selection of actions and the calculation of target Q values are separated into two networks. A new reward function is designed, and a control unit is added to the agent's experience replay unit; the management of experience data is also modified to fully exploit its value and improve learning efficiency. Reinforcement learning agents usually learn from an ignorant initial state, which is inefficient. This study therefore also proposes a novel particle swarm optimization algorithm with an improved fitness function that generates optimized task-scheduling solutions; these solutions are used to pre-train the agent's network parameters so that it starts from a better cognition level. The proposed algorithm is compared with six other methods in simulation experiments. Results show that it outperforms the benchmark methods in terms of makespan.
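A minimal sketch of the decoupling the abstract refers to: the online network selects the greedy next action while the target network evaluates it, the standard Double DQN correction for Q-value overestimation. Tensor shapes and the discount factor are illustrative assumptions:

```python
import torch
import torch.nn as nn

def double_dqn_targets(online: nn.Module, target: nn.Module,
                       rewards: torch.Tensor, next_states: torch.Tensor,
                       dones: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Compute TD targets with action *selection* by the online network and
    action *evaluation* by the target network (two separate networks)."""
    with torch.no_grad():
        best_actions = online(next_states).argmax(dim=1, keepdim=True)
        next_q = target(next_states).gather(1, best_actions).squeeze(1)
    return rewards + gamma * next_q * (1.0 - dones)
```

Under the approach described above, the PSO-generated schedules would supply the states and actions used to pre-train `online` before regular reinforcement learning begins, so the agent does not start from an ignorant state.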