Journal Articles
157 articles found.
1. Transformer-Aided Deep Double Dueling Spatial-Temporal Q-Network for Spatial Crowdsourcing Analysis
Authors: Yu Li, Mingxiao Li, Dongyang Ou, Junjie Guo, Fangyuan Pan. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024(4): 893-909, 17 pages.
With the rapid development of mobile Internet, spatial crowdsourcing has become more and more popular. Spatial crowdsourcing consists of many different types of applications, such as spatial crowd-sensing services. In terms of spatial crowd-sensing, it collects and analyzes traffic sensing data from clients like vehicles and traffic lights to construct intelligent traffic prediction models. Besides collecting sensing data, spatial crowdsourcing also includes spatial delivery services like DiDi and Uber. Appropriate task assignment and worker selection dominate the service quality of spatial crowdsourcing applications. Previous research conducted task assignment via traditional matching approaches or simple network models, but advanced mining methods are lacking to explore the relationships among workers, task publishers, and the spatio-temporal attributes of tasks. Therefore, in this paper, we propose a Deep Double Dueling Spatial-Temporal Q-Network (D3SQN) to adaptively learn the spatial-temporal relationship between tasks, task publishers, and workers in a dynamic environment to achieve optimal allocation. Specifically, D3SQN is revised through reinforcement learning by adding a spatial-temporal transformer that can estimate the expected state values and action advantages so as to improve the accuracy of task assignments. Extensive experiments are conducted over real data collected from DiDi and ELM, and the simulation results verify the effectiveness of our proposed models.
Keywords: historical behavior analysis; spatial crowdsourcing; deep double dueling Q-networks
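The "expected state values and action advantages" in this abstract are the dueling decomposition Q(s, a) = V(s) + A(s, a) - mean_a A(s, a). Below is a minimal PyTorch sketch of a generic dueling head illustrating that technique; the layer sizes and the state_dim/n_actions names are assumptions, not the authors' D3SQN code.

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantages A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.encoder(state)
        v = self.value(h)
        a = self.advantage(h)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)

q_net = DuelingQNetwork(state_dim=16, n_actions=4)
q_values = q_net(torch.randn(2, 16))  # batch of 2 states -> (2, 4) Q-values
```

D3SQN additionally places a spatial-temporal transformer in front of this kind of head; the sketch covers only the value/advantage split.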
2. Reinforcement Learning with an Ensemble of Binary Action Deep Q-Networks
Authors: A. M. Hafiz, M. Hassaballah, Abdullah Alqahtani, Shtwai Alsubai, Mohamed Abdel Hameed. Computer Systems Science & Engineering (SCIE, EI), 2023(9): 2651-2666, 16 pages.
With the advent of Reinforcement Learning (RL) and its continuous progress, state-of-the-art RL systems have come up for many challenging real-world tasks. Given the scope of this area, various techniques are found in the literature. One notable technique, Multiple Deep Q-Network (DQN) based RL systems, uses multiple DQN-based entities which learn together and communicate with each other. In such a scheme, the learning has to be distributed wisely among all entities and the inter-entity communication protocol has to be carefully designed. As more complex DQNs come to the fore, the overall complexity of these multi-entity systems has increased manyfold, leading to issues like difficulty in training, need for high resources, longer training time, and difficulty in fine-tuning, which in turn causes performance issues. Taking a cue from the parallel processing found in nature and its efficacy, we propose a lightweight ensemble-based approach for solving the core RL tasks. It uses multiple binary-action DQNs with a shared state and reward. The benefits of the proposed approach are overall simplicity, faster convergence, and better performance compared to conventional DQN-based approaches. The approach can potentially be extended to any type of DQN by forming its ensemble. Through extensive experimentation, promising results are obtained with the proposed ensemble approach on OpenAI Gym tasks and Atari 2600 games as compared to recent techniques. The proposed approach gives a state-of-the-art score of 500 on the CartPole-v1 task, 259.2 on the LunarLander-v2 task, and state-of-the-art results on four out of five Atari 2600 games.
Keywords: deep Q-networks; ensemble learning; reinforcement learning; OpenAI Gym environments
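The core idea above is to replace one multi-action DQN with several tiny two-action ("skip"/"act") networks that see the same state and reward, then read the joint action off the members' Q-values. The sketch below shows one plausible composition; the paper's exact aggregation rule is not reproduced here, so treat the argmax-over-members step as an assumption.

```python
import torch
import torch.nn as nn

class BinaryActionDQN(nn.Module):
    """One ensemble member: Q-values for its two actions {skip, act}."""
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, state):
        return self.net(state)

def select_action(members, state):
    """Pick the environment action whose member most prefers acting.

    Member i is responsible for action i; all members share the same
    state (and, during training, the same reward signal).
    """
    act_values = torch.stack([m(state)[..., 1] for m in members], dim=-1)
    return act_values.argmax(dim=-1)

members = [BinaryActionDQN(state_dim=8) for _ in range(4)]  # 4 env actions
action = select_action(members, torch.randn(1, 8))  # tensor([i]), 0 <= i < 4
```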
3. A DQN-Based Decentralized Priority Offloading Strategy
Authors: Zhang Junna, Li Tianze, Zhao Xiaoyan, Yuan Peiyan. Computer Engineering (CAS, CSCD, PKU Core), 2024(9): 235-245, 11 pages.
Edge computing (EC) provides users with low-latency, highly responsive services at the network edge, so task offloading strategies with high resource utilization and low delay have become a popular research direction. However, most existing task offloading research assumes a centralized architecture in which a central facility makes the offloading decisions and schedules resources; such designs are vulnerable to single points of failure and incur extra energy consumption and delay. To address these problems, this paper proposes a decentralized priority offloading strategy based on deep Q-networks (DP-DQN). First, a communication matrix is introduced to model the limited communication range of real edge servers. Second, tasks are assigned priorities so that they can hop between edge servers, allowing each edge server to make its own offloading decisions and thereby decentralizing task offloading. Finally, tasks are granted more computing resources according to their hop counts, improving resource utilization and optimization quality. To validate the strategy, the convergence of the parameters under different DQN variants was studied and compared. Experimental results show that DP-DQN outperforms the local algorithm, the fully greedy algorithm, and a multi-objective task offloading algorithm in all test scenarios, with performance gains of roughly 11% to 19%.
Keywords: edge computing; task offloading; resource allocation; decentralization; priority; deep Q-network
4. Convolutional Neural Network-Based Deep Q-Network (CNN-DQN) Resource Management in Cloud Radio Access Network (cited 2)
Authors: Amjad Iqbal, Mau-Luen Tham, Yoong Choon Chang. China Communications (SCIE, CSCD), 2022(10): 129-142, 14 pages.
The recent surge of mobile subscribers and user data traffic has accelerated the telecommunication sector towards the adoption of fifth-generation (5G) mobile networks. Cloud radio access network (CRAN) is a prominent framework in the 5G mobile network that meets these requirements by deploying low-cost, intelligent, distributed antennas known as remote radio heads (RRHs). However, achieving optimal resource allocation (RA) in CRAN with traditional approaches remains challenging due to the complex structure. In this paper, we introduce a convolutional neural network-based deep Q-network (CNN-DQN) to balance energy consumption and guarantee the user quality of service (QoS) demand in downlink CRAN. We first formulate the Markov decision process (MDP) for energy efficiency (EE) and build a 3-layer CNN to capture the environment features as the input state space. We then use a DQN to turn the RRHs on and off dynamically based on the user QoS demand and energy consumption in the CRAN. Finally, we solve the RA problem based on the user constraints and transmit power to guarantee the user QoS demand and maximize the EE with a minimum number of active RRHs. In the end, we conduct simulations to compare our proposed scheme with the Nature DQN and the traditional approach.
Keywords: energy efficiency (EE); Markov decision process (MDP); convolutional neural network (CNN); cloud RAN; deep Q-network (DQN)
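The pipeline described above is a 3-layer CNN that encodes the environment observation, feeding a Q-value head that switches RRHs on or off. A sketch of such CNN-DQN wiring follows; the channel counts, grid size, and the per-RRH on/off output encoding are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CNNDQN(nn.Module):
    """3-layer CNN feature extractor followed by a Q-value head."""
    def __init__(self, in_channels: int, grid: int, n_rrh: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(), nn.Flatten())
        self.head = nn.Linear(32 * grid * grid, 2 * n_rrh)  # off/on per RRH
        self.n_rrh = n_rrh

    def forward(self, obs):
        q = self.head(self.features(obs))
        return q.view(-1, self.n_rrh, 2)  # Q(s, rrh, {off, on})

net = CNNDQN(in_channels=3, grid=8, n_rrh=5)
q = net(torch.randn(1, 3, 8, 8))
switch = q.argmax(dim=-1)  # greedy on/off decision for each RRH
```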
5. An Obstacle Avoidance Method for Autonomous Driving Using DQN-Based Obstacle Classification
Authors: Liu Hangbo, Ma Li, Li Yang, Ma Dongchao, Fu Yingxun. Computer Engineering (CAS, CSCD, PKU Core), 2024(11): 380-389, 10 pages.
Safety is the primary concern for autonomous vehicles, and obstacle avoidance is the most effective means of ensuring driving safety. Learning-based avoidance methods have attracted researchers' attention because they can learn from the environment and make decisions directly from perception. Deep Q-networks (DQN), a popular reinforcement learning method, have made great progress in autonomous obstacle avoidance, but existing methods ignore the influence of obstacle type on the avoidance strategy. This paper proposes a Classification Security DQN (CSDQN) driving decision framework built on accurate obstacle classification. It produces safer driving decisions based on the obstacle type and environmental information, thereby improving driving safety. First, detected obstacles are classified according to their safety level. Then, a safety evaluation function is proposed for each obstacle type, using position uncertainty and a distance-based safety metric to assess safety. The CSDQN decision framework then iterates using the obstacle types, relative position information, and the safety evaluation function to obtain the final model. Simulation results show that, compared with state-of-the-art deep reinforcement learning, in scenarios with multiple obstacles CSDQN improves safety by 43.9% and 4.2% over DQN and SDQN respectively, and stability by 17.8% and 3.7%.
Keywords: autonomous driving; deep Q-network; classification-based obstacle avoidance; evaluation function; safety
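A safety evaluation of the kind this abstract describes combines a distance-based margin with position uncertainty, weighted by the obstacle's class. One hedged way to write such a scoring function is sketched below; the class weights, the 2-sigma uncertainty inflation, and the safe-distance constant are assumptions for illustration, not the paper's function.

```python
import numpy as np

# Illustrative per-class risk weights: higher means treat as more dangerous.
CLASS_RISK = {"pedestrian": 1.0, "cyclist": 0.8, "vehicle": 0.6, "static": 0.3}

def safety_score(ego_pos, obs_pos, obs_sigma, obs_class, d_safe=10.0):
    """Safety in [0, 1]: distance margin shrunk by position uncertainty.

    obs_sigma is the std-dev of the obstacle position estimate; pulling
    the obstacle 2*sigma closer gives a conservative effective distance.
    """
    dist = np.linalg.norm(np.asarray(ego_pos) - np.asarray(obs_pos))
    effective = max(dist - 2.0 * obs_sigma, 0.0)   # uncertainty-adjusted
    margin = min(effective / d_safe, 1.0)          # 1 = comfortably far
    return 1.0 - CLASS_RISK[obs_class] * (1.0 - margin)

print(safety_score((0, 0), (6, 0), obs_sigma=0.5, obs_class="pedestrian"))
```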
6. A DQN-Based Multi-Agent Deep Reinforcement Learning Motion Planning Method (cited 2)
Authors: Shi Dianxi, Peng Yingxuan, Yang Huanhuan, Ouyang Qianying, Zhang Yuhui, Hao Feng. Computer Science (CSCD, PKU Core), 2024(2): 268-277, 10 pages.
As a classic value-based deep reinforcement learning method, DQN is widely used in fields such as multi-agent motion planning. However, DQN faces several challenges: it overestimates Q-values, Q-value computation is complex, the neural network has no memory of history, and ε-greedy exploration is inefficient. To address these problems, this paper proposes a DQN-based multi-agent deep reinforcement learning motion planning method that helps agents learn efficient, stable motion planning policies and reach their goals without collisions. First, building on DQN, a dueling-based Q-value optimization mechanism decomposes the Q-value into a state value and an advantage value and selects the optimal action according to the parameters of the Q-network currently being updated, making Q-value computation simpler and more accurate. Second, a GRU-based memory mechanism introduces a GRU module so that the network can capture temporal information and process the agents' historical observations. Finally, a noise-based exploration mechanism introduces parameterized noise to change the exploration mode of DQN, improving exploration efficiency and bringing the multi-agent system to an exploration-exploitation balance. Tests in six different scenarios on the PyBullet simulation platform show that the proposed method enables efficient multi-agent cooperation, with all agents reaching their goals collision-free and a stable policy training process.
Keywords: multi-agent systems; motion planning; deep reinforcement learning; DQN
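The parameterized noise mentioned above usually means NoisyNet-style layers: learnable Gaussian noise on the linear weights, making exploration state-dependent and letting it anneal as training drives the noise scales down. A compact sketch of a factorized noisy linear layer follows; the hyperparameters are the usual NoisyNet defaults, assumed rather than taken from the paper.

```python
import math
import torch
import torch.nn as nn

class NoisyLinear(nn.Module):
    """Linear layer with learnable factorized Gaussian noise (NoisyNet)."""
    def __init__(self, in_f: int, out_f: int, sigma0: float = 0.5):
        super().__init__()
        self.in_f, self.out_f = in_f, out_f
        self.w_mu = nn.Parameter(torch.empty(out_f, in_f))
        self.w_sigma = nn.Parameter(torch.full((out_f, in_f), sigma0 / math.sqrt(in_f)))
        self.b_mu = nn.Parameter(torch.empty(out_f))
        self.b_sigma = nn.Parameter(torch.full((out_f,), sigma0 / math.sqrt(in_f)))
        bound = 1.0 / math.sqrt(in_f)
        nn.init.uniform_(self.w_mu, -bound, bound)
        nn.init.uniform_(self.b_mu, -bound, bound)

    @staticmethod
    def _f(x):  # signed-sqrt scaling used for factorized noise
        return x.sign() * x.abs().sqrt()

    def forward(self, x):
        eps_in = self._f(torch.randn(self.in_f, device=x.device))
        eps_out = self._f(torch.randn(self.out_f, device=x.device))
        w = self.w_mu + self.w_sigma * eps_out.outer(eps_in)
        b = self.b_mu + self.b_sigma * eps_out
        return nn.functional.linear(x, w, b)

layer = NoisyLinear(16, 4)
q = layer(torch.randn(2, 16))  # noisy Q-values drive exploration directly
```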
7. Sweep Jamming Suppression for FDA-MIMO Radar Based on DQN and Power Allocation
Authors: Zhou Changlin, Wang Chunyang, Gong Jian, Tan Ming, Bao Lei, Liu Mingjie. Radar Science and Technology (PKU Core), 2024(2): 155-160, 169, 7 pages.
The frequency increments across its array elements give frequency diversity array (FDA) radar many new properties, including flexible control of the transmit waveform spectrum through transmit power allocation. Assuming an electromagnetic interference environment of sweep jamming, this paper first introduces a reinforcement learning framework to model the interaction between a frequency diversity array multiple-input multiple-output (FDA-MIMO) radar and the jamming environment, enabling the radar to sense and suppress interference while interacting with it. It then proposes a sweep jamming suppression method based on a deep Q-network (DQN) and FDA-MIMO transmit power allocation, which maximizes the SINR while making full use of spectrum resources. Finally, simulation results confirm that, within the reinforcement learning framework, the FDA-MIMO radar can optimize its transmit power allocation to suppress jamming and improve radar performance.
Keywords: frequency diversity array; sweep jamming; reinforcement learning; deep Q-network; power allocation
8. A Traffic Signal Control Method Based on Dueling Double DQN
Authors: Ye Baolin, Chen Dong, Liu Chunyuan, Chen Bin, Wu Weimin. Computer Measurement & Control, 2024(7): 154-161, 8 pages.
To improve intersection throughput, relieve congestion, and mine the deep latent features contained in traffic state information, this paper proposes a single-intersection traffic signal control method based on Dueling Double DQN (D3QN). A traffic signal control model based on deep reinforcement learning Double DQN (DDQN) is built, optimizing the iterative computation of the estimated and target action-value functions to overcome the slow convergence of DQN-based signal control models. A new dueling network is then designed to decouple the value of the traffic state from the value of the phase actions, strengthening DDQN's ability to extract deep features. A single-intersection simulation framework and environment are built on the microscopic simulation platform SUMO for testing. Simulation results show that, compared with traditional signal control methods and DQN-based signal control, the proposed method effectively reduces average vehicle waiting time, average queue length, and average number of stops, clearly improving intersection throughput.
Keywords: traffic signal control; deep reinforcement learning; Dueling Double DQN; dueling network
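The Double DQN update that D3QN builds on decouples action selection from action evaluation: the online network picks the argmax action and the target network scores it, which curbs Q-value overestimation. A minimal sketch of that target computation, with tensor shapes and the toy networks as assumptions:

```python
import torch

@torch.no_grad()
def double_dqn_target(online_net, target_net, reward, next_state, gamma, done):
    """y = r + gamma * Q_target(s', argmax_a Q_online(s', a)) if s' non-terminal."""
    best_action = online_net(next_state).argmax(dim=1, keepdim=True)   # selection
    next_q = target_net(next_state).gather(1, best_action).squeeze(1)  # evaluation
    return reward + gamma * (1.0 - done) * next_q

# Toy usage with linear "networks"; real models would be deeper.
online, target = torch.nn.Linear(8, 4), torch.nn.Linear(8, 4)
y = double_dqn_target(online, target, reward=torch.ones(32),
                      next_state=torch.randn(32, 8), gamma=0.99,
                      done=torch.zeros(32))
```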
9. Obstacle Avoidance Path Planning for Mobile Robots Based on an Improved DQN (cited 1)
Authors: Tian Xiaoyuan, Dong Xiucheng. Journal of Chinese Inertial Technology (EI, CSCD, PKU Core), 2024(4): 406-416, 11 pages.
To address the long learning time, poor exploration, and sparse rewards that general reinforcement learning methods suffer from in obstacle avoidance path planning, this paper proposes mobile robot obstacle avoidance path planning based on an improved deep Q-network (DQN). First, an obstacle learning rule is designed on top of the traditional DQN algorithm to avoid relearning the same obstacle, improving learning efficiency and success rate. Second, a reward optimization method is proposed: rewards derived from differences in visit counts between states balance how often states are visited and prevent over-visiting, while the Euclidean distance to the goal biases the robot toward paths that approach the goal; the penalty for moving away from the goal is removed, yielding an adaptively optimized reward mechanism. Finally, a dynamic exploration factor function shifts late-stage training toward exploiting the learned policy for action selection, improving performance and learning efficiency. Simulation results show that, compared with the traditional DQN algorithm, the improved algorithm shortens training time by 40.25%, raises the obstacle avoidance success rate by 79.8%, and shortens path length by 2.25%.
Keywords: mobile robot; DQN; path planning; obstacle avoidance; deep reinforcement learning
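The reward design above has three visible ingredients: a visit-count balancing term, a pull toward the goal measured by Euclidean distance, and no penalty for moving away. A hedged sketch of such a shaped reward; the coefficients and terminal bonuses are assumptions, not the paper's values.

```python
import numpy as np

def shaped_reward(pos, prev_pos, goal, visits, reached, collided,
                  k_dist=1.0, k_visit=0.1):
    """Progress reward plus visit-count balancing; no away-from-goal penalty."""
    if reached:
        return 100.0
    if collided:
        return -50.0
    # Positive when this step moved closer to the goal (Euclidean distance).
    progress = np.linalg.norm(np.subtract(prev_pos, goal)) - \
               np.linalg.norm(np.subtract(pos, goal))
    # Penalize revisiting heavily-visited cells to balance state coverage.
    visit_penalty = k_visit * visits[tuple(pos)]
    # max(progress, 0) removes the penalty for moving away from the goal.
    return k_dist * max(progress, 0.0) - visit_penalty

visits = np.zeros((10, 10), dtype=int)
r = shaped_reward((2, 3), (2, 4), (9, 9), visits, reached=False, collided=False)
```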
10. UAV Autonomous Navigation for Wireless Powered Data Collection with Onboard Deep Q-Network
Authors: LI Yuting, DING Yi, GAO Jiangchuan, LIU Yusha, HU Jie, YANG Kun. ZTE Communications, 2023(2): 80-87, 8 pages.
In a rechargeable wireless sensor network, utilizing an unmanned aerial vehicle (UAV) as a mobile base station (BS) to charge sensors and collect data effectively prolongs the network's lifetime. In this paper, we jointly optimize the UAV's flight trajectory and the sensor selection and operation modes to maximize the average data traffic of all sensors within a wireless sensor network (WSN) during the UAV's finite flight time, while ensuring the energy required by each sensor through wireless power transfer (WPT). We consider a practical scenario in which the UAV has no prior knowledge of sensor locations. The UAV performs autonomous navigation based on the status information obtained within its coverage area, which is modeled as a Markov decision process (MDP). A deep Q-network (DQN) executes the navigation based on the UAV position, the battery level state, channel conditions, and the current data traffic of sensors within the UAV's coverage area. Our simulation results demonstrate that the DQN algorithm significantly improves the network performance in terms of average data traffic and trajectory design.
Keywords: unmanned aerial vehicle; wireless power transfer; deep Q-network; autonomous navigation
11. DQN-Based Operation Path Planning for Agricultural Unmanned Vehicles
Authors: Zhuang Jinwei, Zhang Xiaofei, Yin Qidong, Chen Ke. Journal of Shenyang Ligong University (CAS), 2024(4): 32-37, 6 pages.
Traditional agricultural unmanned vehicles usually follow operation routes determined by human experience, which cannot guarantee efficient path planning in complex working environments, and traditional coverage path planning focuses on coverage rate while ignoring the losses the vehicle incurs along its route. This paper therefore proposes an optimal full-coverage path planning method that aims to reduce route losses. Based on the deep Q-network (DQN) algorithm, a reward policy derived from the vehicle's real operating trajectories (RLP) is created to optimize route losses by reducing the number of turns, U-turns, and the repeatedly covered area, yielding the RLP-DQN algorithm. Simulation results show that, compared with traditional path planning methods such as genetic algorithms and the A* algorithm, RLP-DQN performs well overall, achieving full-coverage path planning while effectively reducing route losses.
Keywords: agricultural unmanned vehicle; path planning; deep reinforcement learning; DQN
12. UAV Path Planning Based on Dueling DQN in Unknown Environments
Authors: Zhao Tiantian, Kong Jianguo, Liang Haijun, Liu Chenyu. Modern Computer, 2024(5): 37-43, 7 pages.
To solve the path planning problem for UAVs in unknown environments, a Dueling DQN based path planning method is proposed. First, a dueling network architecture is introduced on top of DQN to improve the success rate. Second, the state space is designed, and discretized actions and an appropriate reward function are defined to guide the UAV toward learning optimal paths. Finally, DQN and Dueling DQN are trained in a simulation environment. The results show that: (1) Dueling DQN can plan collision-free paths from the start point to the goal in unknown environments and obtains higher reward values; (2) after 50,000 training episodes, Dueling DQN's success rate is 17.71% higher than DQN's, its collision rate is 1.57% lower, and its rate of exceeding the maximum step count drops by 16.14%.
Keywords: UAV; path planning; deep reinforcement learning; Dueling DQN
13. Double DQN Method for Botnet Traffic Detection System
Authors: Yutao Hu, Yuntao Zhao, Yongxin Feng, Xiangyu Ma. Computers, Materials & Continua (SCIE, EI), 2024(4): 509-530, 22 pages.
In the face of the increasingly severe botnet problem on the Internet, how to effectively detect botnet traffic in real time has become a critical problem. Although the existing deep Q-network (DQN) algorithm in deep reinforcement learning can solve the problem of real-time updating, its prediction results are always higher than the actual results. In botnet traffic detection, it performs well on the training set, where its traffic prediction accuracy is as high as %; however, on the test set its accuracy declines and it cannot adjust its prediction strategy in time based on new data samples, and on new datasets its accuracy declines significantly. Therefore, this paper proposes a botnet traffic detection system based on a double-layer DQN (DDQN). Two Q-values are designed to adjust the model in policy and action respectively, achieving real-time model updates and improving the universality and robustness of the model across different datasets. Experiments show that, compared with the DQN model, with DDQN the Q-value is not inflated and the detection model improves the accuracy and precision of botnet traffic detection. Moreover, on botnet datasets other than the test set, the accuracy and precision of the DDQN model remain higher than DQN's.
Keywords: DQN; DDQN; deep reinforcement learning; botnet detection; feature classification
14. Dynamic Obstacle Avoidance Path Planning Based on an Improved DQN
Authors: Zheng Chenwei, Hou Lingyan, Wang Chao, Zhao Qingjuan, Zou Zhiyuan. Journal of Beijing Information Science and Technology University (Natural Science Edition), 2024(5): 14-22, 9 pages.
Under path planning with dynamic obstacles, mobile robots trained with a traditional deep Q-learning network (DQN) collide frequently during exploration and struggle to reach the goal. This paper proposes an improved DQN algorithm with changes to both the exploration strategy and the experience replay mechanism. For exploration, static prior knowledge generated automatically by the rapidly-exploring random tree (RRT) algorithm guides action selection in place of the random actions of the ε-greedy strategy, raising the agent's success rate in reaching the goal. For experience utilization, a clustered experience replay mechanism is designed with the K-means algorithm: experiences are clustered by the position information of the dynamic obstacles, and replay emphasizes experiences similar to the agent's current state, helping the agent avoid collisions with dynamic obstacles more effectively. Simulations in a 2D grid environment show that, in dynamic environments, the algorithm can avoid both static and dynamic obstacles and successfully reach the goal, verifying its feasibility for dynamic obstacle avoidance path planning.
Keywords: dynamic environment; path planning; deep Q-learning network; obstacle avoidance; experience replay
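The clustered replay mechanism groups stored transitions by the dynamic-obstacle configuration with K-means, then samples mostly from the cluster closest to the agent's current situation. A sketch using scikit-learn follows; the cluster count, refit cadence, and 75/25 sampling split are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans

class ClusteredReplay:
    """Replay buffer that favors transitions similar to the current state."""

    def __init__(self, n_clusters=8):
        self.transitions = []   # (s, a, r, s', done) tuples
        self.keys = []          # dynamic-obstacle position vectors
        self.n_clusters = n_clusters
        self.kmeans, self.labels = None, None

    def add(self, transition, obstacle_vec):
        self.transitions.append(transition)
        self.keys.append(obstacle_vec)

    def refit(self):
        """Re-cluster periodically as the buffer grows."""
        self.kmeans = KMeans(n_clusters=self.n_clusters, n_init=10)
        self.labels = self.kmeans.fit_predict(np.asarray(self.keys))

    def sample(self, current_obstacle_vec, batch=32, similar_frac=0.75):
        """Draw mostly from the cluster matching the current obstacle layout."""
        cluster = self.kmeans.predict(np.asarray(current_obstacle_vec)[None])[0]
        same = np.flatnonzero(self.labels == cluster)
        n_same = min(int(batch * similar_frac), len(same))
        idx = np.concatenate([
            np.random.choice(same, n_same, replace=True),
            np.random.choice(len(self.transitions), batch - n_same)])
        return [self.transitions[i] for i in idx]
```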
15. Automatic depth matching method of well log based on deep reinforcement learning
Authors: XIONG Wenjun, XIAO Lizhi, YUAN Jiangru, YUE Wenzheng. Petroleum Exploration and Development (SCIE), 2024(3): 634-646, 13 pages.
In traditional well log depth matching tasks, manual adjustments are required, which is significantly labor-intensive for multiple wells and leads to low work efficiency. This paper introduces a multi-agent deep reinforcement learning (MARL) method to automate the depth matching of multi-well logs. The method defines multiple top-down dual sliding windows based on a convolutional neural network (CNN) to extract and capture similar feature sequences on well logs, and it establishes an interaction mechanism between the agents and the environment to control the depth matching process. Specifically, an agent selects an action to translate or scale a feature sequence based on a double deep Q-network (DDQN); through the feedback of the reward signal, it evaluates the effectiveness of each action, aiming to obtain the optimal strategy and improve the accuracy of the matching task. Our experiments show that MARL can automatically perform depth matching for well logs across multiple wells and reduce manual intervention. In an oil field application, a comparative analysis of dynamic time warping (DTW), deep Q-learning network (DQN), and DDQN methods revealed that the DDQN algorithm, with its dual-network evaluation mechanism, significantly improves performance by identifying and aligning more details in the well log feature sequences, thus achieving higher depth matching accuracy.
Keywords: artificial intelligence; machine learning; depth matching; well log; multi-agent deep reinforcement learning; convolutional neural network; double deep Q-network
16. Associative Tasks Computing Offloading Scheme in Internet of Medical Things with Deep Reinforcement Learning
Authors: Jiang Fan, Qin Junwei, Liu Lei, Tian Hui. China Communications (SCIE, CSCD), 2024(4): 38-52, 15 pages.
The Internet of Medical Things (IoMT) is regarded as a critical technology for intelligent healthcare in the foreseeable 6G era. Nevertheless, due to the limited computing power of edge devices and task-related coupling relationships, IoMT faces unprecedented challenges. Considering the associative connections among tasks, this paper proposes a computing offloading policy for multiple user devices (UDs) that exploits device-to-device (D2D) communication and the multi-access edge computing (MEC) technique under an IoMT scenario. Specifically, to minimize the total delay and energy consumption with respect to IoMT requirements, we first analyze and model local execution, MEC execution, D2D execution, and the associated task offloading exchange model in detail. The associated task offloading scheme for multiple UDs is then formulated as a mixed-integer non-convex optimization problem. Given the advantages of deep reinforcement learning (DRL) in processing tasks with coupling relationships, a Double DQN based associative tasks computing offloading (DDATO) algorithm is proposed to obtain the optimal solution, which can make the best offloading decision when the UDs' tasks are associative. Furthermore, to reduce the complexity of the DDATO algorithm, a cache-aided procedure is introduced before the data training process; this avoids redundant offloading and computing for tasks that have already been cached by other UDs. In addition, a dynamic ε-greedy strategy is used in the action selection section of the algorithm, preventing it from falling into a locally optimal solution. Simulation results demonstrate that, compared with other existing methods for associative task models with different structures in IoMT networks, the proposed algorithm lowers the total cost more effectively and efficiently while also providing a tradeoff between delay and energy consumption tolerance.
Keywords: associative tasks; cache-aided procedure; double deep Q-network; Internet of Medical Things (IoMT); multi-access edge computing (MEC)
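A dynamic ε-greedy strategy decays the exploration rate over training, so early steps explore broadly while later steps exploit the learned Q-values, which is how a scheme like this avoids settling into a locally optimal policy. A common exponential-decay sketch; the schedule and constants are generic assumptions, not the paper's.

```python
import math
import random

def epsilon(step, eps_start=1.0, eps_end=0.05, decay=5000):
    """Exponentially anneal the exploration rate from eps_start to eps_end."""
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay)

def act(q_values, step):
    """Dynamic epsilon-greedy: mostly random early on, mostly greedy later."""
    if random.random() < epsilon(step):
        return random.randrange(len(q_values))                  # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit

print(round(epsilon(0), 3), round(epsilon(20000), 3))  # 1.0 early, ~0.067 late
```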
17. Multi-Agent Deep Q-Networks for Efficient Edge Federated Learning Communications in Software-Defined IoT
Authors: Prohim Tam, Sa Math, Ahyoung Lee, Seokhoon Kim. Computers, Materials & Continua (SCIE, EI), 2022(5): 3319-3335, 17 pages.
Federated learning (FL) activates distributed on-device computation techniques to model better algorithm performance through the interaction of local model updates and global model distributions in the aggregation averaging process. However, in large-scale heterogeneous Internet of Things (IoT) cellular networks, massive multi-dimensional model update iterations and resource-constrained computation are significant challenges. This paper introduces a system model that converges software-defined networking (SDN) and network functions virtualization (NFV) to enable device/resource abstractions and provide NFV-enabled edge FL (eFL) aggregation servers for advancing automation and controllability. Multi-agent deep Q-networks (MADQNs) are targeted to enforce self-learning softwarization, optimize resource allocation policies, and advocate computation offloading decisions. With gathered network conditions and resource states, the proposed agent explores various actions to estimate the expected long-term rewards in a particular state observation. In the exploration phase, optimal actions for joint resource allocation and offloading decisions in different possible states are obtained by maximum Q-value selection. An action-based virtual network function forwarding graph (VNFFG) is orchestrated to map VNFs to an eFL aggregation server with sufficient communication and computation resources in the NFV infrastructure (NFVI). The proposed scheme identifies deficient allocation actions, modifies the VNF backup instances, and reallocates the virtual resources for the exploitation phase. A deep neural network (DNN) is used as the value function approximator, and an epsilon-greedy algorithm balances exploration and exploitation. The scheme primarily considers the criticality of FL model services and congestion states to optimize the long-term policy. Simulation results show that the proposed scheme outperforms reference schemes in terms of Quality of Service (QoS) performance metrics, including packet drop ratio, packet drop counts, packet delivery ratio, delay, and throughput.
Keywords: deep Q-networks; federated learning; network functions virtualization; quality of service; software-defined networking
18. Research on the Container Relocation Problem Considering Ship Stowage Plans Based on an Improved DQN
Authors: Liang Chengji, Hua Yue, Wang Yu. Journal of Chongqing Jiaotong University (Natural Science) (CAS, CSCD, PKU Core), 2024(9): 43-49, 77, 8 pages.
To satisfy ship stowage plan requirements, reduce yard crane relocations, and improve terminal efficiency, this paper studies the container relocation problem under ship stowage plan constraints. The problem extends the traditional container relocation problem by accounting for the influence of the stowage plan on relocations. A DQN algorithm is designed to solve for the minimum number of relocations, and to improve its performance, a heuristic-based threshold and a new reward function are added on top of the original algorithm. Comparison with experimental results from other studies shows that the improved DQN algorithm beats the best results of current heuristic algorithms on every instance, with the advantage growing with instance size; in training time, the improved DQN algorithm greatly outperforms the unimproved one, and the time savings also grow with instance size.
Keywords: transportation engineering; maritime shipping; container relocation; ship stowage plan; DQN
19. A Dual-Mode Multi-Objective Signal Timing Method Based on Double DQN
Authors: Nie Lei, Zhang Mingxuan, Huang Qinghan, Bao Haizhou. Computer Technology and Development, 2024(8): 143-150, 8 pages.
In recent years, deep reinforcement learning has been widely applied to traffic signal control as an efficient and reliable machine learning method. However, existing signal timing methods usually ignore the priority passage of special vehicles (e.g., ambulances and fire engines), and signal timing based on traditional deep reinforcement learning optimizes a single objective, which performs poorly in complex traffic scenarios. To address these problems, this paper proposes a Dual-mode Multi-objective signal timing method based on Double DQN (DMDD) that integrates priority passage for special vehicles, improving intersection throughput in different traffic scenarios. The method first selects the signal control mode according to the saturation state of the intersection; in emergency control mode, special vehicles are given higher passage weights, helping them cross the intersection faster. Neural networks are then designed to compute rewards for three indicators: waiting time, queue length, and CO2 emissions. Finally, Double DQN selects the optimal signal phase, and flexible phase switching improves throughput. Experimental results based on SUMO show that, compared with baseline methods, DMDD effectively reduces the waiting time, queue length, and CO2 emissions of special vehicles at intersections; special vehicles cross intersections faster, effectively improving throughput.
Keywords: traffic signal timing; deep reinforcement learning; dual-mode multi-objective; Double DQN; SUMO
20. Intelligent Temperature Control of Ceramic Shuttle Kilns Based on an Improved DQN
Authors: Zhu Yonghong, Yu Yingjian, Li Manhua. China Ceramic Industry (CAS), 2024(5): 33-38, 6 pages.
To handle the large delay, nonlinearity, slow time variation, and strong coupling of ceramic shuttle kilns, an intelligent temperature control method based on an improved DQN algorithm is proposed. First, a shuttle kiln model is built on a BP neural network. Then, an intelligent control method based on the improved DQN algorithm is proposed. Finally, the proposed method is studied in simulation. The results show that the improved PRDQN algorithm keeps the temperature control relative error within 0 °C to 5 °C, a relatively good control performance, demonstrating that the proposed method is effective and feasible.
Keywords: ceramic shuttle kiln; deep reinforcement learning; BP neural network; PRDQN