Journal Articles
13 articles found
1. A Traffic Light Control Method Based on Deep Q Networks (Cited by 2)
Authors: 颜文胜, 吕红兵. 《计算机测量与控制》, 2021, No. 6, pp. 93-97.
Intelligent control of traffic lights is a hot topic in current intelligent transportation research. To adapt to dynamic traffic more promptly and effectively, and to further improve traffic flow efficiency at intersections, a traffic light control method based on Deep Q Networks is proposed. Starting from a formal description of the traffic light control problem, the method builds a reinforcement learning model of traffic light control around the three elements of state, action, and reward, and presents a Deep-Q-Networks-based control workflow. To verify the method's effectiveness, comparison and simulation experiments were carried out in SUMO using traffic data from the intersection of Shifu Avenue and Donghuan Avenue in Taizhou, Zhejiang Province. The results show that the method achieves higher efficiency and autonomy in traffic light control and scheduling, improves intersection throughput, and performs better in optimizing vehicle dwell time, queue length, and waiting time at intersections.
Keywords: traffic lights; Deep Q Networks; intelligent transportation; signal control
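To make the three-element formulation above concrete, here is a minimal DQN sketch for signal control. The state encoding (queue lengths per approach), the four-phase action set, the network sizes, and the hyperparameters are illustrative assumptions, not the paper's design.

```python
import random
from collections import deque

import torch
import torch.nn as nn

# Illustrative DQN for signal control: state = queue lengths per approach,
# action = index of the signal phase to activate, reward = e.g. negative queue.
class QNet(nn.Module):
    def __init__(self, n_state=8, n_action=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_state, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_action),
        )

    def forward(self, s):
        return self.net(s)

q = QNet()
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)   # stores (state, action, reward, next_state)
gamma, eps = 0.99, 0.1

def act(state):
    # epsilon-greedy over the learned Q-values
    if random.random() < eps:
        return random.randrange(4)
    with torch.no_grad():
        return q(torch.tensor(state).float()).argmax().item()

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    s, a, r, s2 = map(torch.tensor, zip(*random.sample(replay, batch_size)))
    # one-step TD target: r + gamma * max_a' Q(s', a')
    target = r.float() + gamma * q(s2.float()).max(dim=1).values.detach()
    pred = q(s.float()).gather(1, a.long().unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad(); loss.backward(); opt.step()
```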
2. Collaborative Pushing and Grasping Control of a Robotic Arm Based on Deep Q Networks (Cited by 2)
Author: 贺道坤. 《现代制造工程》, CSCD, 北大核心, 2021, No. 7, pp. 23-28.
To address the limited application of robotic arms in complex scenes and the scarcity of research on autonomous collaborative pushing-and-grasping control, this paper leverages the rule-free, self-learning strengths of Deep Q Networks and proposes a DQN-based collaborative pushing and grasping control method for robotic arms. Two fully convolutional networks map scene information to pushing or grasping actions; through a Markov process with a far-sighted reward mechanism, the optimal action function is selected, achieving autonomous collaborative control of pushing and grasping in complex scenes. In both simulated and real-world experiments, the method quickly grasps objects in complex scenes through autonomously coordinated pushing and grasping, with higher action efficiency and grasp success rate.
Keywords: robotic arm; grasping; pushing; Deep Q Networks (DQN); collaborative control
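The mapping from scene to action that the abstract describes can be sketched as follows: the two fully convolutional networks each output one Q-value per pixel for their primitive (push or grasp), and the joint argmax selects both the primitive and where to apply it. All shapes and layer sizes are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Sketch of the two fully convolutional nets: each maps an H x W scene
# observation to a per-pixel Q map for one primitive (push or grasp).
def fcn():
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 1),  # one Q-value per pixel
    )

push_net, grasp_net = fcn(), fcn()

def select_action(scene):            # scene: (1, 3, H, W) tensor
    q_push = push_net(scene)         # (1, 1, H, W)
    q_grasp = grasp_net(scene)
    q_all = torch.cat([q_push, q_grasp], dim=1)   # channel 0: push, 1: grasp
    idx = int(q_all.flatten().argmax())
    h, w = scene.shape[-2], scene.shape[-1]
    primitive, loc = divmod(idx, h * w)           # which net won, and where
    return ("push", "grasp")[primitive], divmod(loc, w)

scene = torch.rand(1, 3, 64, 64)
print(select_action(scene))          # e.g. ('grasp', (12, 40))
```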
3. Artificial Potential Field Incorporated Deep-Q-Network Algorithm for Mobile Robot Path Prediction
Authors: A. Sivaranjani, B. Vinod. 《Intelligent Automation & Soft Computing》, SCIE, 2023, No. 1, pp. 1135-1150.
Autonomous navigation of mobile robots is a challenging task that requires them to travel from their initial position to their destination without collision. Reinforcement learning methods enable a mobile robot to learn a state-action function suited to its environment: through trial-and-error interaction with its surroundings, the robot finds ideal behavior on its own. The Deep Q Network (DQN) algorithm is used in TurtleBot 3 (TB3) to reach the goal while successfully avoiding obstacles, but it requires a large number of training iterations. This research focuses on a mobile robot's best path prediction using the DQN and Artificial Potential Field (APF) algorithms. First, a TB3 Waffle Pi DQN is built and trained to reach the goal. Then the APF shortest-path algorithm is incorporated into the DQN algorithm. The proposed planning approach is compared with the standard DQN method in a virtual environment based on the Robot Operating System (ROS). The simulation results show that the DQN+APF combination is effective, giving a better optimal path in less time than the conventional DQN algorithm. In terms of the number of successful targets, the proposed DQN+APF improves on DQN by 88%; in terms of average time, by 0.331 s; and in terms of average rewards, the positive goal is attained by 85% and the negative goal by -90%.
Keywords: artificial potential field; deep reinforcement learning; mobile robot; TurtleBot; Deep Q Network; path prediction
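For reference, the classical APF component can be sketched as below: an attractive force toward the goal plus a repulsive force from nearby obstacles. The gains `k_att` and `k_rep` and the influence range `d0` are illustrative values; how the paper fuses this direction with the DQN policy is not reproduced here.

```python
import numpy as np

# Minimal artificial potential field step (assumed gains and ranges):
# attractive pull toward the goal plus repulsive push away from obstacles.
def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=2.0):
    force = k_att * (goal - pos)                      # attractive term
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 1e-9 < d < d0:                             # repulse only inside range d0
            force += k_rep * (1 / d - 1 / d0) / d**3 * diff
    return force / (np.linalg.norm(force) + 1e-9)     # unit step direction

pos, goal = np.array([0.0, 0.0]), np.array([10.0, 10.0])
obstacles = [np.array([5.0, 5.0])]
print(apf_step(pos, goal, obstacles))
```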
4. DQN-Based Proactive Trajectory Planning of UAVs in Multi-Access Edge Computing
Authors: Adil Khan, Jinling Zhang, Shabeer Ahmad, Saifullah Memon, Babar Hayat, Ahsan Rafiq. 《Computers, Materials & Continua》, SCIE, EI, 2023, No. 3, pp. 4685-4702.
The main aim of future mobile networks is to provide secure, reliable, intelligent, and seamless connectivity. It also enables mobile network operators to offer their customers a better quality of service (QoS). Nowadays, Unmanned Aerial Vehicles (UAVs) are a significant part of the mobile network due to their continuously growing use in various applications. For better coverage, cost-effectiveness, and seamless service connectivity and provisioning, UAVs have emerged as the best choice for telco operators: they can serve as flying base stations, edge servers, and relay nodes. Meanwhile, Multi-access Edge Computing (MEC) technology has emerged in the 5G network to provide a better quality of experience (QoE) to users with different QoS requirements. However, using UAVs for coverage enhancement and better QoS raises several challenges, such as trajectory design, path planning, optimization, QoS assurance, and mobility management. Efficient and proactive path planning and optimization in a highly dynamic environment containing buildings and obstacles is difficult, so an automated, Artificial Intelligence (AI)-enabled, QoS-aware solution is needed. This work therefore introduces a well-designed AI- and MEC-enabled architecture for a UAV-assisted future network. It includes an efficient Deep Reinforcement Learning (DRL) algorithm for real-time and proactive trajectory planning and optimization, and it fulfills QoS-aware service provisioning. A greedy-policy approach is used to maximize the long-term reward for serving more users with QoS. Simulation results reveal the superiority of the proposed DRL mechanism for energy-efficient and QoS-aware trajectory planning over existing models.
Keywords: multi-access edge computing; UAVs; trajectory planning; QoS assurance; reinforcement learning; Deep Q Network
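The "long-term reward" that the greedy policy maximizes is the discounted return rather than the one-step reward; a toy comparison with assumed reward values:

```python
# Toy illustration of the long-term-reward idea (assumed reward values):
# the agent compares discounted returns, not single-step rewards.
def discounted_return(rewards, gamma=0.95):
    g = 0.0
    for r in reversed(rewards):        # G_t = r_t + gamma * G_{t+1}
        g = r + gamma * g
    return g

# A trajectory that serves more users later can beat a myopic one-step choice.
print(discounted_return([0, 0, 10]))   # 9.025
print(discounted_return([3, 0, 0]))    # 3.0
```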
5. Deep reinforcement learning for UAV swarm rendezvous behavior
Authors: ZHANG Yaozhong, LI Yike, WU Zhuoran, XU Jialin. 《Journal of Systems Engineering and Electronics》, SCIE, EI, CSCD, 2023, No. 2, pp. 360-373.
Unmanned aerial vehicle (UAV) swarm technology has been one of the research hotspots in recent years. With the continuous improvement of UAV autonomy, swarm technology will become one of the main trends of UAV development. This paper studies the behavior decision-making process of the UAV swarm rendezvous task based on the double deep Q network (DDQN) algorithm. We design a guided reward function to effectively solve the convergence problem caused by sparse returns in deep reinforcement learning (DRL) for long-horizon tasks. We also propose the concept of a temporary storage area, optimizing the memory replay unit of the traditional DDQN algorithm, improving convergence speed, and speeding up training. Different from traditional task environments, this paper establishes a continuous state-space task environment model to improve the verification of the UAV task environment. Based on the DDQN algorithm, the collaborative tasks of the UAV swarm in different scenarios are trained. The experimental results validate that the DDQN algorithm efficiently trains the UAV swarm to complete the given collaborative tasks while meeting the swarm's requirements for centralization and autonomy, improving the intelligence of collaborative task execution. The simulation results show that, after training, the proposed UAV swarm carries out the rendezvous task well, with a mission success rate of 90%.
Keywords: double deep Q network (DDQN) algorithm; unmanned aerial vehicle (UAV) swarm; task decision; deep reinforcement learning (DRL); sparse returns
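The double-DQN update the paper builds on decouples action selection from action evaluation, as in the first function below. The paper's "temporary storage area" is described only at a high level; the per-episode buffer that back-propagates a sparse terminal reward before flushing to replay memory is one plausible illustration, not the authors' exact mechanism.

```python
import torch

# Double DQN target (standard): the online net picks the action, the target
# net evaluates it, which reduces the max-operator overestimation bias.
def ddqn_target(online, target, r, s2, gamma=0.99):
    with torch.no_grad():
        a_star = online(s2).argmax(dim=1, keepdim=True)              # select
        return r + gamma * target(s2).gather(1, a_star).squeeze(1)   # evaluate

# Illustrative "temporary storage area" (an assumption, not the paper's rule):
# transitions wait in a per-episode buffer and are written to replay memory
# only at episode end, so a sparse terminal return can be spread backwards.
class TempArea:
    def __init__(self, replay):
        self.replay, self.episode = replay, []

    def push(self, transition):          # transition = (s, a, r, s2)
        self.episode.append(transition)

    def flush(self, final_reward, gamma=0.99):
        g = final_reward
        for (s, a, r, s2) in reversed(self.episode):
            g = r + gamma * g            # spread the sparse return backwards
            self.replay.append((s, a, g, s2))
        self.episode.clear()
```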
6. Real-time UAV path planning based on LSTM network
Authors: ZHANG Jiandong, GUO Yukun, ZHENG Lihui, YANG Qiming, SHI Guoqing, WU Yong. 《Journal of Systems Engineering and Electronics》, SCIE, CSCD, 2024, No. 2, pp. 374-385.
To address the shortcomings of single-step decision making in existing deep reinforcement learning based unmanned aerial vehicle (UAV) real-time path planning, a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, combining the memory characteristics of recurrent neural networks (RNNs) with a deep reinforcement learning algorithm. LSTM networks serve as the Q-value networks of the deep Q network (DQN) algorithm, which gives the Q-value network's decisions a degree of memory. Thanks to the LSTM network, the Q-value network can use previous environmental and action information, effectively avoiding the problem of single-step decisions that consider only the current environment. In addition, a hierarchical reward and punishment function is designed for the specific problem of UAV real-time path planning, so that the UAV plans paths more reasonably. Simulation shows that, compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, the proposed RPP-LSTM adapts to more complex environments and has significantly improved robustness and accuracy in real-time path planning.
Keywords: Deep Q Network; path planning; neural network; unmanned aerial vehicle (UAV); long short-term memory (LSTM)
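Using an LSTM as the Q-value network, as the abstract describes, means Q-values are computed from a short history of states rather than the current state alone. A minimal sketch with assumed dimensions:

```python
import torch
import torch.nn as nn

# Sketch of an LSTM used as the Q-value network (the RPP-LSTM idea):
# Q-values are conditioned on a short history of states, not just the
# current one. State/action sizes here are illustrative.
class LSTMQNet(nn.Module):
    def __init__(self, n_state=16, n_action=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_state, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_action)

    def forward(self, seq):               # seq: (batch, T, n_state)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])      # Q-values from the last step's memory

qnet = LSTMQNet()
history = torch.rand(1, 5, 16)            # the last 5 environment states
action = qnet(history).argmax(dim=1)      # decision uses the whole history
```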
7. Research on Deep-Learning-Based Short-Term Stochastic Optimal Scheduling of Hydro-Wind-Solar Systems
Author: 张一凡. 《水电与新能源》, 2024, No. 3, pp. 34-37.
China is committed to renewable energy development and has proposed hydro-wind-solar multi-energy complementary systems; because wind and solar output is uncertain, grid dispatch must be adjusted in real time. This paper applies deep reinforcement learning (DQN) to optimize the system's short-term scheduling and maximize generation benefits. Latin hypercube sampling and a scenario reduction technique based on the Kantorovich distance are used to represent the uncertainty distribution of renewable output, and a short-term optimal scheduling model of the multi-energy complementary system is built with deep reinforcement learning. Simulations on real data show that the method effectively handles high-dimensionality issues and offers clear advantages over traditional methods.
Keywords: short-term scheduling; uncertainty; Latin hypercube sampling; scenario reduction; Deep Q Network
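A minimal Latin hypercube sampler, the first of the two scenario-generation steps the abstract mentions: each dimension is split into n equal strata with one sample per stratum, so the marginals are evenly covered. The Kantorovich-distance scenario reduction step (greedily merging scenarios that are close in probability-weighted distance) is not shown.

```python
import numpy as np

# Minimal Latin hypercube sampling over [0, 1]^d: one sample per stratum
# in every dimension, with strata orders shuffled independently.
def latin_hypercube(n, d, seed=0):
    rng = np.random.default_rng(seed)
    samples = (rng.random((n, d)) + np.arange(n)[:, None]) / n  # row i in stratum i
    for j in range(d):
        samples[:, j] = samples[rng.permutation(n), j]          # decouple dimensions
    return samples

scenarios = latin_hypercube(100, 2)   # e.g. wind and solar output factors
print(scenarios.min(), scenarios.max())
```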
8. Safe Navigation for UAV-Enabled Data Dissemination by Deep Reinforcement Learning in Unknown Environments (Cited by 1)
Authors: Fei Huang, Guangxia Li, Shiwei Tian, Jin Chen, Guangteng Fan, Jinghui Chang. 《China Communications》, SCIE, CSCD, 2022, No. 1, pp. 202-217.
Unmanned aerial vehicles (UAVs) are increasingly considered in safe autonomous navigation systems that explore unknown environments, where UAVs are equipped with multiple sensors to perceive their surroundings. However, how to achieve UAV-enabled data dissemination while also ensuring safe navigation is a new challenge. In this paper, our goal is to minimize the weighted sum of the UAV's task completion time while satisfying the data transmission task requirement and the UAV's feasible flight region constraints. This problem cannot be solved with standard optimization methods, mainly because a tractable and accurate system model is lacking in practice. To overcome this issue, we propose a new solution approach utilizing the advanced dueling double deep Q network (dueling DDQN) with multi-step learning. Specifically, to improve the algorithm, extra labels are added to the primitive states. Simulation results indicate the validity and performance superiority of the proposed algorithm under different data thresholds compared with two other benchmarks.
Keywords: unmanned aerial vehicles (UAVs); safe autonomous navigation; unknown environments; data dissemination; dueling double deep Q network (dueling DDQN)
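Two standard ingredients named in the abstract, sketched with assumed sizes: the dueling decomposition Q(s,a) = V(s) + A(s,a) - mean(A), and the n-step (multi-step) bootstrapped target.

```python
import torch
import torch.nn as nn

# Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
class DuelingQNet(nn.Module):
    def __init__(self, n_state=12, n_action=6, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_state, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)      # state-value stream V(s)
        self.adv = nn.Linear(hidden, n_action) # advantage stream A(s, a)

    def forward(self, s):
        h = self.body(s)
        a = self.adv(h)
        return self.value(h) + a - a.mean(dim=1, keepdim=True)

# n-step return used as the bootstrapped target in multi-step learning:
# G = r_t + gamma r_{t+1} + ... + gamma^n Q(s_{t+n}, a*).
def n_step_target(rewards, q_boot, gamma=0.99):
    g = q_boot                                 # Q estimate at step t+n
    for r in reversed(rewards):                # rewards r_t ... r_{t+n-1}
        g = r + gamma * g
    return g

net = DuelingQNet()
print(net(torch.rand(1, 12)))                  # six Q-values
```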
9. Design of a Control Experiment for the Googol Technology Linear Single-Stage Inverted Pendulum Based on Deep Reinforcement Learning
Authors: 冯肖雪, 谢天, 温岳, 李位星. 《科技资讯》, 2023, No. 23, pp. 4-10.
To meet the learning needs of university students majoring in artificial intelligence in the field of machine learning, while balancing the operability, real-time performance, and safety of the Googol Technology linear single-stage inverted pendulum control system, a deep-reinforcement-learning control experiment for this platform is designed. First, a controller is built using the model-free control structure of a deep reinforcement learning algorithm and verified in virtual simulation experiments. Considering the refresh-rate limit of the pendulum's motor drive and the need to speed up sample processing, a balance controller based on an offline Q-learning algorithm is further designed to achieve stable control of the physical pendulum. The experimental scheme deepens students' understanding of artificial intelligence while fitting the application scenario of the Googol linear single-stage inverted pendulum.
Keywords: linear single-stage inverted pendulum; deep reinforcement learning; Deep Q Network algorithm; Q-learning algorithm
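The offline balance controller rests on the tabular Q-learning update; a sketch with an assumed discretization (angle and angular-velocity bins) and toy logged transitions:

```python
import numpy as np

# Tabular Q-learning for the balance controller: states are discretized
# (angle, angular velocity) bins, actions are discrete cart forces.
# All sizes and the logged samples are illustrative assumptions.
n_states, n_actions = 24 * 24, 3      # 24 angle bins x 24 velocity bins
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99

def q_update(s, a, r, s2):
    # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])

# Offline training replays logged transitions instead of driving the rig,
# which respects the actuator refresh-rate limit the abstract mentions.
logged = [(0, 1, 1.0, 1), (1, 2, -1.0, 0)]   # (s, a, r, s') samples
for s, a, r, s2 in logged:
    q_update(s, a, r, s2)
```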
10. Situational continuity-based air combat autonomous maneuvering decision-making
Authors: Jian-dong Zhang, Yi-fei Yu, Li-hui Zheng, Qi-ming Yang, Guo-qing Shi, Yong Wu. 《Defence Technology (防务技术)》, SCIE, EI, CAS, CSCD, 2023, No. 11, pp. 66-79.
In order to improve the performance of a UAV's autonomous maneuvering decision-making, this paper proposes a decision-making method based on situational continuity. The method designs a strongly guiding situation evaluation function, then trains a Long Short-Term Memory (LSTM) network under the framework of a Deep Q Network (DQN) for air combat maneuvering decision-making. Considering the continuity between adjacent situations, the method takes multiple consecutive situations as one network input. To reflect the difference between adjacent situations, it takes the difference of the situation evaluation value as the reinforcement learning reward. In different scenarios, the proposed algorithm is compared with an algorithm based on a Fully Neural Network (FNN) and an algorithm based on statistical principles. The results show that, compared with the FNN algorithm, the proposed algorithm is more accurate and forward-looking; compared with the statistical algorithm, its decision-making is more efficient and its real-time performance is better.
Keywords: UAV; maneuvering decision-making; situational continuity; long short-term memory (LSTM); deep Q network (DQN); fully neural network (FNN)
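The two mechanisms the abstract highlights, sketched with placeholder functions: stacking several consecutive situations into one network input, and using the difference of the situation evaluation value between adjacent steps as the reward. `eval_situation` here is a stand-in, not the paper's hand-designed evaluation function.

```python
import numpy as np

# (1) Reward = difference of situation evaluation between adjacent steps,
# so the agent is rewarded for improving the situation, not for its level.
def eval_situation(sit):
    return float(np.sum(sit))             # stand-in scoring, not the paper's

def shaped_reward(prev_sit, curr_sit):
    return eval_situation(curr_sit) - eval_situation(prev_sit)

# (2) Network input = the last k consecutive situations stacked together.
def stack_input(history, k=4):
    return np.concatenate(history[-k:])

history = [np.random.rand(8) for _ in range(6)]
x = stack_input(history)                   # network input, shape (32,)
r = shaped_reward(history[-2], history[-1])
```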
11. Tactical conflict resolution in urban airspace for unmanned aerial vehicles operations using attention-based deep reinforcement learning
Authors: Mingcheng Zhang, Chao Yan, Wei Dai, Xiaojia Xiang, Kin Huat Low. 《Green Energy and Intelligent Transportation》, 2023, No. 4, pp. 43-57.
Unmanned aerial vehicles (UAVs) have gained much attention from academic and industrial areas due to the significant number of potential applications in urban airspace, and a traffic management system is needed to manage this future traffic. Tactical conflict resolution for unmanned aerial systems (UASs) is an essential piece of the puzzle for future UAS Traffic Management (UTM), especially in very low-level (VLL) urban airspace. Unlike conflict resolution in higher-altitude airspace, dense high-rise buildings are an essential source of potential conflict in VLL urban airspace. In this paper, we propose an attention-based deep reinforcement learning approach to the tactical conflict resolution problem. Specifically, we formulate the task as a sequential decision-making problem using a Markov Decision Process (MDP). The double deep Q network (DDQN) framework is used as the learning framework for the host drone, which learns to output conflict-free maneuvers at each time step. We use the attention mechanism to model each neighbor's effect on the host drone, enabling the learned conflict-resolution policy to adapt to an arbitrary number of neighboring drones. Lastly, we build a simulation environment with various scenarios covering different types of encounters to evaluate the proposed approach. The simulation results demonstrate that our algorithm provides a reliable solution that minimizes secondary conflict counts compared with learning and non-learning-based approaches under different traffic-density scenarios.
Keywords: unmanned aircraft system traffic management; tactical conflict resolution; double deep Q network; attention mechanism; secondary conflict
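The attention mechanism that makes the policy independent of the number of neighbors can be sketched as a single attention-pooling layer: the host state forms the query, each neighbor contributes a key and a value, and the weighted sum is a fixed-size summary for the DDQN input. Dimensions are assumptions.

```python
import torch
import torch.nn as nn

# Attention pooling over an arbitrary number of neighboring drones: the
# host state queries the neighbor states, so the policy input size stays
# fixed no matter how many neighbors are present.
class NeighborAttention(nn.Module):
    def __init__(self, d_host=8, d_nbr=8, d=32):
        super().__init__()
        self.q = nn.Linear(d_host, d)
        self.k = nn.Linear(d_nbr, d)
        self.v = nn.Linear(d_nbr, d)
        self.scale = d ** 0.5

    def forward(self, host, neighbors):    # host: (d_host,), neighbors: (N, d_nbr)
        scores = self.k(neighbors) @ self.q(host) / self.scale  # (N,)
        w = torch.softmax(scores, dim=0)    # each neighbor's influence weight
        return w @ self.v(neighbors)        # fixed-size neighbor summary, (d,)

att = NeighborAttention()
host, neighbors = torch.rand(8), torch.rand(5, 8)   # works for any N
context = att(host, neighbors)              # concatenate with host for the DDQN
```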
12. Fault Identification in Power Network Based on Deep Reinforcement Learning (Cited by 3)
Authors: Mengshi Li, Huanming Zhang, Tianyao Ji, Q. H. Wu. 《CSEE Journal of Power and Energy Systems》, SCIE, EI, CSCD, 2022, No. 3, pp. 721-731.
With the integration of alternative energy and renewables, the stability and resilience of the power network has received considerable attention. The basic necessity for fault diagnosis and isolation is fault identification and location. Conventional intelligent fault identification methods need supervision and manual labelling of features, and require large amounts of labelled data. To enhance the ability of intelligent methods and remove the dependence on large labelled datasets, this paper investigates a novel fault identification method based on deep reinforcement learning (DRL), which has not received enough attention in the fault identification field. The proposed method uses different faults as parameters of the model to expand the scope of fault identification. In addition, the DRL algorithm can intelligently modify the fault parameters according to observations obtained from the power network environment, rather than requiring manual and mechanical parameter tuning. The methodology was tested on the IEEE 14-bus system for several scenarios, and its performance was compared with population-based optimization methods and supervised learning methods. The results confirm the feasibility and effectiveness of the proposed method.
Keywords: artificial intelligence; deep Q network; deep reinforcement learning; fault diagnosis; fault identification; parameter identification; power network
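One illustrative reading of "faults as model parameters": actions nudge a candidate fault-parameter vector, and the reward is the improvement in fit between simulated and measured quantities. `simulate` below is a toy placeholder for a short-circuit or power-flow solver, and the greedy loop stands in for the learned policy; none of this is the paper's exact formulation.

```python
import numpy as np

# Toy stand-in for a solver: fit between candidate fault parameters and
# measurements, peaking at the hidden true parameters.
def simulate(params):
    true_params = np.array([0.6, 0.05])   # hidden ground truth (assumed)
    return np.exp(-np.linalg.norm(params - true_params))

params = np.array([0.5, 0.5])             # [fault distance, fault resistance]
steps = {0: np.array([0.05, 0.0]), 1: np.array([-0.05, 0.0]),
         2: np.array([0.0, 0.02]), 3: np.array([0.0, -0.02])}

def reward(action):
    # did the candidate adjustment improve the measurement fit?
    return simulate(params + steps[action]) - simulate(params)

for _ in range(50):                        # greedy stand-in for the learned policy
    params = params + steps[max(steps, key=reward)]
print(params)                              # approaches [0.6, 0.05]
```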
13. Reusable electronic products value prediction based on reinforcement learning (Cited by 1)
Authors: DU YongPing, JIN XingNan, HAN HongGui, WANG LuLin. 《Science China (Technological Sciences)》, SCIE, EI, CAS, CSCD, 2022, No. 7, pp. 1578-1586.
With the appearance of a huge number of reusable electronic products, precise value evaluation has become an urgent problem in the recycling process. Traditional methods mostly rely on manual intervention. To make the model better suited to dynamic updating, this paper proposes a reinforcement-learning-based electronic product value prediction model that integrates market information to achieve timely and stable predictions. The basic attributes and depreciation attributes of the product are modeled by two parallel neural networks separately, to learn their different effects on prediction. Most importantly, a double deep Q network is adopted to fuse market information through a reinforcement learning strategy, and training on old product data can be used to predict newly appearing products, which alleviates the cold-start problem. Experiments on real mobile-phone recycling platform data verify that the model achieves higher accuracy and better generalization.
Keywords: electronic product value prediction; reinforcement learning; market factor; deep Q network
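The two-branch structure the abstract describes can be sketched as parallel encoders for basic and depreciation attributes merged before the prediction head; the feature split and layer sizes are assumptions. The DDQN-based fusion of market information is not reproduced here.

```python
import torch
import torch.nn as nn

# Two parallel branches, one for basic product attributes and one for
# depreciation attributes, merged before the value-prediction head.
class TwoTowerValueNet(nn.Module):
    def __init__(self, n_basic=10, n_depr=6, hidden=32):
        super().__init__()
        self.basic = nn.Sequential(nn.Linear(n_basic, hidden), nn.ReLU())
        self.depr = nn.Sequential(nn.Linear(n_depr, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x_basic, x_depr):
        h = torch.cat([self.basic(x_basic), self.depr(x_depr)], dim=1)
        return self.head(h)                  # predicted recycling value

net = TwoTowerValueNet()
price = net(torch.rand(1, 10), torch.rand(1, 6))
```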