Abstract: To address the high energy consumption incurred when the Yangtze New Chain (长江新链) network provisions network resources for shipping services, network resources and computing resources are modeled uniformly as underlying substrate resources, and a network-resource management model is built for the network-slicing environment. Based on the characteristics of Yangtze New Chain slice resources, an objective function minimizing energy consumption is formulated. A Markov decision process is then constructed, and a Double Deep Q-Network (DDQN) based slice-resource allocation algorithm for the Yangtze New Chain network is proposed. Experimental comparisons with related algorithms verify that the proposed algorithm selects low-energy network resources to serve shipping services, reducing the energy consumption of Yangtze New Chain network resources by about 13.7% and raising the resource-allocation success rate by about 10.2%.
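A minimal sketch of the DDQN update underlying this kind of slice-resource allocation, assuming a hypothetical setup in which the state encodes the pending slice request plus per-node load and power draw, each action selects one candidate substrate node, and the reward is the negative energy cost of that choice; the layer sizes and names below are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Maps a slice-request state to one Q-value per candidate substrate node."""
    def __init__(self, state_dim: int, n_nodes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_nodes),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

def ddqn_update(online: QNet, target: QNet, opt: torch.optim.Optimizer,
                s, a, r, s_next, done, gamma: float = 0.99) -> float:
    """One DDQN step: the online net picks the next action, the target net scores it."""
    q_sa = online(s).gather(1, a.unsqueeze(1)).squeeze(1)       # Q(s, a) for taken actions
    with torch.no_grad():
        a_star = online(s_next).argmax(dim=1, keepdim=True)     # selection: online net
        q_next = target(s_next).gather(1, a_star).squeeze(1)    # evaluation: target net
        y = r + gamma * (1.0 - done) * q_next                   # r is e.g. -energy_cost
    loss = nn.functional.mse_loss(q_sa, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Here `target` is a periodically synchronized copy of `online`; the selection/evaluation split in `ddqn_update` is the only part that distinguishes this from a plain DQN update.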
Abstract: For the inventory-cost control problem faced by retailers of fresh agricultural products, the problem is cast as a Markov decision process. A three-parameter Weibull function is introduced to describe the deterioration characteristics of fresh produce, and expiration, deterioration, shortage, ordering, and holding costs are all considered. An inventory-cost control model is built from a supply-chain perspective, and the Double Deep Q-Network (DDQN) algorithm from deep reinforcement learning is used to optimize ordering decisions so as to control total inventory cost. Experimental results show that, compared with a single-period stochastic inventory-cost control model and a fixed-order-quantity model, the DDQN model reduces total cost by about 6% and 11%, respectively, demonstrating practical value.
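To make the deterioration model concrete, the sketch below evaluates a three-parameter Weibull law (shape β, scale η, location γ, where γ is the delay before spoilage starts) and the per-period loss it implies; the parameter values are hypothetical, not the paper's fitted values.

```python
import math

def weibull_cdf(t: float, beta: float, eta: float, gamma: float) -> float:
    """Fraction of items that have deteriorated by age t (three-parameter Weibull)."""
    if t <= gamma:
        return 0.0
    return 1.0 - math.exp(-((t - gamma) / eta) ** beta)

def deterioration_rate(t: float, beta: float, eta: float, gamma: float) -> float:
    """Instantaneous deterioration (hazard) rate at age t."""
    if t <= gamma:
        return 0.0
    return (beta / eta) * ((t - gamma) / eta) ** (beta - 1)

def spoiled_this_period(stock: float, t: float, dt: float,
                        beta: float, eta: float, gamma: float) -> float:
    """Units lost between ages t and t+dt, out of `stock` units still fresh at age t."""
    f_t, f_next = weibull_cdf(t, beta, eta, gamma), weibull_cdf(t + dt, beta, eta, gamma)
    if f_t >= 1.0:
        return stock
    return stock * (f_next - f_t) / (1.0 - f_t)

# Hypothetical parameters: no decay for the first day, then accelerating spoilage.
print(spoiled_this_period(stock=100.0, t=2.0, dt=1.0, beta=2.0, eta=5.0, gamma=1.0))
```

In an inventory simulation of this kind, the spoiled quantity would feed the deterioration-cost term of the period reward, alongside ordering, holding, shortage, and expiration costs.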
Abstract: To address the difficulty of predicting enemy maneuver strategies and the low win rate caused by the complex, highly adversarial information environment of unmanned aerial vehicle (UAV) air combat, a guided Minimax-DDQN (Minimax Double Deep Q-Network) algorithm is designed. First, a guided policy-exploration mechanism is proposed on top of the Minimax decision method; next, combined with the guided Minimax policy, a DDQN algorithm is designed with the aim of improving the efficiency of Q-network updates; finally, a progressive three-stage network training method is proposed, in which adversarial training between different decision models yields a better-optimized decision model. Experimental results show that, compared with Minimax-DQN, Minimax-DDQN, and related algorithms, the proposed algorithm improves the success rate of pursuing a straight-line target by 14% to 60%, and its win rate against the DDQN algorithm is no lower than 60%. Compared with DDQN, Minimax-DDQN, and similar algorithms, the proposed algorithm thus shows stronger decision-making capability and better adaptability in highly adversarial combat environments.
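The Minimax part of such an algorithm reduces to choosing the own maneuver that maximizes the worst-case value over the enemy's replies. A toy illustration over a joint value table is sketched below, assuming discretized maneuvers for both aircraft; the paper instead learns these values with a Q-network and adds a guided exploration mechanism on top.

```python
import numpy as np

def minimax_action(q_joint: np.ndarray) -> int:
    """q_joint[a, b]: value of own maneuver a against enemy maneuver b in the current state.
    Returns the maneuver with the best worst-case value: argmax_a min_b Q(a, b)."""
    worst_case = q_joint.min(axis=1)     # value of each own maneuver under the worst reply
    return int(worst_case.argmax())

# Toy example: 3 own maneuvers x 4 enemy maneuvers.
q = np.array([[ 0.8, -0.5,  0.2,  0.1],
              [ 0.4,  0.3,  0.2,  0.5],   # best guaranteed payoff (min = 0.2)
              [ 0.9,  0.1, -0.9,  0.7]])
print(minimax_action(q))  # -> 1
```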
Funding: Liaoning Province Applied Basic Research Program, 2023JH2/101600038.
Abstract: In the face of the increasingly severe botnet problem on the Internet, effectively detecting botnet traffic in real time has become a critical problem. Although the existing deep Q-network (DQN) algorithm in deep reinforcement learning can solve the problem of real-time updating, its predictions are consistently higher than the actual values. In botnet traffic detection it performs well on the training set, where its traffic-prediction accuracy is high; on the test set, however, its accuracy declines and it cannot adjust its prediction strategy in time as new data samples arrive, and on new datasets its accuracy drops significantly. This paper therefore proposes a botnet traffic detection system based on a double-layer deep Q-network (DDQN). Two Q-values are designed to adjust the model in policy and in action, respectively, achieving real-time model updates and improving the universality and robustness of the model across different datasets. Experiments show that, compared with the DQN model, the DDQN's Q-values are not overestimated, and the detection model improves the accuracy and precision of botnet traffic detection. Moreover, on botnet datasets other than the test set, the accuracy and precision of the DDQN model remain higher than those of DQN.
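A small numeric sketch of why decoupling action selection from action evaluation curbs the overestimation described above, using two independently noisy estimates of next-state Q-values; it illustrates the generic double-Q target rather than this paper's specific policy/action two-Q-value design.

```python
import numpy as np

rng = np.random.default_rng(0)
true_q = np.zeros(5)                     # all next-state actions are truly worth 0
gamma, r = 0.99, 0.0

dqn_targets, ddqn_targets = [], []
for _ in range(10_000):
    q_online = true_q + rng.normal(0.0, 1.0, size=5)   # independent estimation noise
    q_target = true_q + rng.normal(0.0, 1.0, size=5)
    # DQN: one estimate both selects and evaluates, so the max picks up positive noise.
    dqn_targets.append(r + gamma * q_target.max())
    # DDQN: the online net selects, the target net evaluates, so errors largely cancel.
    a_star = q_online.argmax()
    ddqn_targets.append(r + gamma * q_target[a_star])

print(f"mean DQN target:  {np.mean(dqn_targets):+.3f}")   # clearly > 0 (overestimated)
print(f"mean DDQN target: {np.mean(ddqn_targets):+.3f}")  # close to the true value 0
```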
Abstract: To address blocked line-of-sight links between an unmanned aerial vehicle (UAV) and ground users in an urban air-to-ground model, a deep-reinforcement-learning scheme for optimizing the UAV communication rate is proposed. A reconfigurable intelligent surface (RIS) assists the UAV communication, and a double deep Q-network (DDQN) algorithm jointly optimizes the RIS phase shifts and the UAV's 3D trajectory to improve the communication rate; the scheme is validated on a self-built simulation platform. The results show that, compared with a DDQN scheme using random RIS phase shifts, a DDQN scheme without an RIS, and a dueling deep Q-network scheme with optimized RIS phase shifts, the proposed scheme improves the average throughput over the UAV flight period by 38.61%, 30.03%, and 53.97%, respectively.
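One common way to let a single DDQN drive both controls is to encode each discrete action as a (UAV displacement, RIS phase level) pair and reward the resulting achievable rate. The sketch below assumes a hypothetical move grid, 2-bit phase quantization, and a single-element cascaded channel, none of which are the paper's simulation parameters.

```python
import itertools
import math

# Hypothetical discrete action space: 7 UAV moves x 4 RIS phase levels = 28 actions.
UAV_MOVES = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
PHASE_LEVELS = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]   # 2-bit phase quantization
ACTIONS = list(itertools.product(range(len(UAV_MOVES)), range(len(PHASE_LEVELS))))

def decode_action(a: int):
    """Map one DDQN output index to a (UAV displacement, RIS phase shift) pair."""
    move_idx, phase_idx = ACTIONS[a]
    return UAV_MOVES[move_idx], PHASE_LEVELS[phase_idx]

def achievable_rate(h_direct: complex, h_uav_ris: complex, h_ris_user: complex,
                    phase: float, tx_power: float = 1.0, noise: float = 1e-3) -> float:
    """Reward: rate (bit/s/Hz) of the direct path plus the single-element RIS cascaded path."""
    cascaded = h_uav_ris * complex(math.cos(phase), math.sin(phase)) * h_ris_user
    snr = tx_power * abs(h_direct + cascaded) ** 2 / noise
    return math.log2(1.0 + snr)

move, phase = decode_action(9)
print(move, phase, achievable_rate(0.01 + 0.02j, 0.3 - 0.1j, 0.2 + 0.05j, phase))
```

In a full simulation the channel coefficients would be recomputed from the UAV's updated 3D position each step, so the reward couples the trajectory choice and the phase-shift choice as the abstract describes.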