Journal Articles
5 articles found
Hierarchical Reinforcement Learning With Automatic Sub-Goal Identification (Cited by: 1)
Authors: Chenghao Liu, Fei Zhu, Quan Liu, Yuchen Fu. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2021, Issue 10, pp. 1686-1696 (11 pages)
In reinforcement learning, an agent may explore ineffectively when dealing with sparse reward tasks where finding a reward point is difficult. To solve this problem, we propose an algorithm called hierarchical deep reinforcement learning with automatic sub-goal identification via computer vision (HADS), which takes advantage of hierarchical reinforcement learning to alleviate the sparse reward problem and improves the efficiency of exploration by utilizing a sub-goal mechanism. HADS uses a computer vision method to identify sub-goals automatically for hierarchical deep reinforcement learning. Because not all sub-goal points are reachable, a mechanism is proposed to remove unreachable sub-goal points so as to further improve the performance of the algorithm. HADS involves contour recognition to identify sub-goals from the state image: some salient states in the state image may be recognized as sub-goals, while those that are not will be removed based on prior knowledge. Our experiments verified the effectiveness of the algorithm.
Keywords: hierarchical control; hierarchical reinforcement learning; option; sparse reward; sub-goal
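The two-stage pipeline the abstract describes (identify salient sub-goal candidates, then prune unreachable ones) can be sketched as follows. The brightness threshold below is only an illustrative stand-in for the paper's contour recognition, and all names and parameters are hypothetical:

```python
def identify_subgoal_candidates(state_image, threshold=200):
    """Flag salient pixels as candidate sub-goal coordinates.
    (HADS uses contour recognition on the state image; this simple
    brightness threshold is an illustrative stand-in.)"""
    return [(r, c)
            for r, row in enumerate(state_image)
            for c, value in enumerate(row)
            if value >= threshold]

def prune_unreachable(candidates, reach_counts, min_visits=1):
    """Remove candidate sub-goals the agent never actually reached,
    mirroring the paper's unreachable-point removal mechanism."""
    return [g for g in candidates if reach_counts.get(g, 0) >= min_visits]
```

In this sketch, `reach_counts` would be accumulated while the lower-level policy trains, so candidates the agent never reaches are dropped over time.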
Hierarchical reinforcement learning guidance with threat avoidance
Authors: LI Bohao, WU Yunjie, LI Guofei. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2022, Issue 5, pp. 1173-1185 (13 pages)
The guidance strategy is a critical factor in determining the striking effect of a missile operation. A novel guidance law is presented by exploiting deep reinforcement learning (DRL) with the hierarchical deep deterministic policy gradient (DDPG) algorithm. The reward functions are constructed to minimize the line-of-sight (LOS) angle rate and avoid the threat posed by opposing obstacles. To attenuate chattering of the acceleration, a hierarchical reinforcement learning structure and an improved reward function with an action penalty are put forward. The simulation results validate that a missile under the proposed method can hit the target successfully and keep away from the threatened areas effectively.
Keywords: guidance law; deep reinforcement learning (DRL); threat avoidance; hierarchical reinforcement learning
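The shaped reward the abstract outlines (minimize the LOS angle rate, penalize acceleration chattering, avoid threat zones) might look roughly like this. The weights and functional form are illustrative assumptions, not the paper's actual reward:

```python
def guidance_reward(los_rate, accel, prev_accel, threat_dists,
                    safe_radius=1.0, w_los=1.0, w_act=0.1, w_threat=10.0):
    """Shaped reward: penalize the LOS angle rate, acceleration
    chattering (action penalty), and proximity to threat zones.
    All weights here are illustrative guesses."""
    reward = -w_los * abs(los_rate)
    reward -= w_act * abs(accel - prev_accel)  # action penalty damps chattering
    for d in threat_dists:
        if d < safe_radius:
            reward -= w_threat * (safe_radius - d)  # threat-avoidance penalty
    return reward
```

A safe, steady trajectory scores 0; entering a threat zone or commanding a large acceleration jump is penalized, which is the shaping effect the paper attributes to its improved reward.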
Hierarchical Reinforcement Learning Adversarial Algorithm Against Opponent with Fixed Offensive Strategy
Authors: 赵英策, 张广浩, 邢正宇, 李建勋. Journal of Shanghai Jiaotong University (Science) (EI), 2024, Issue 3, pp. 471-479 (9 pages)
Based on the option-critic algorithm, a new adversarial algorithm named deterministic policy network with option architecture is proposed to improve an agent's performance against an opponent with a fixed offensive strategy. An option network is introduced in the upper-level design, which generates an activation signal choosing between defensive and offensive strategies according to the current situation. The lower-level executive layer then computes the interactive action under the guidance of the activation signal, and the values of both the activation signal and the interactive action are evaluated jointly by a critic structure. This method effectively relaxes the requirement of a semi-Markov decision process and simplifies the network structure by eliminating the termination-probability layer. Experimental results show that the new algorithm switches neatly between offensive and defensive strategy styles and acquires more reward from the environment than the classical deep deterministic policy gradient algorithm does.
Keywords: hierarchical reinforcement learning; fixed offensive strategy; option architecture; deterministic policy gradient
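The two-level structure described above can be sketched minimally: an upper level that emits the activation signal and a lower level that executes the activated deterministic policy. Greedy selection stands in for the learned option network, and every identifier here is hypothetical:

```python
def select_option(option_logits):
    """Upper level: the option network's activation signal picks the
    strategy style (0 = offensive, 1 = defensive)."""
    return max(range(len(option_logits)), key=lambda i: option_logits[i])

def act(state, option_logits, policies):
    """Lower level: the activated deterministic policy maps the state
    to an interactive action. Because the option is re-selected at
    every step, no termination-probability layer is needed, which is
    the simplification the paper claims."""
    option = select_option(option_logits)
    return option, policies[option](state)
```

In the paper, a critic would evaluate the (signal, action) pair jointly; here the toy policies just transform the state.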
Autonomous Overtaking for Intelligent Vehicles Considering Social Preference Based on Hierarchical Reinforcement Learning (Cited by: 6)
Authors: Hongliang Lu, Chao Lu, Yang Yu, Guangming Xiong, Jianwei Gong. Automotive Innovation (EI, CSCD), 2022, Issue 2, pp. 195-208 (14 pages)
As intelligent vehicles usually have a complex overtaking process, a safe and efficient automated overtaking system (AOS) is vital to avoid accidents caused by driver error. Existing AOSs rarely consider the longitudinal reactions of the overtaken vehicle (OV) during overtaking. This paper proposes a novel AOS based on hierarchical reinforcement learning, where the longitudinal reaction is given by a data-driven social preference estimation. The AOS incorporates two modules that function in different overtaking phases. The first module, based on a semi-Markov decision process and motion primitives, is built for motion planning and control. The second module, based on a Markov decision process, is designed to enable vehicles to make proper decisions according to the social preference of the OV. The proposed AOS and its modules are verified experimentally on realistic overtaking data. The test results show that the proposed AOS can realize safe and effective overtaking in scenes built from realistic data, and can flexibly adjust lateral driving behavior and lane-changing position when the OVs have different social preferences.
Keywords: automated overtaking system; semi-Markov decision process; hierarchical reinforcement learning; social preference
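The division of labor between the two modules can be illustrated with a toy dispatcher: a decision step conditioned on the OV's estimated social preference, and a planning step that runs motion primitives. The phase names, preference labels, and gap thresholds are all hypothetical, not from the paper:

```python
def overtaking_action(phase, ov_preference, gap_m):
    """Toy dispatcher over the two AOS modules. In the decision phase
    (the MDP module), a social-preference estimate of the overtaken
    vehicle sets how large a gap is needed before committing to the
    lane change; other phases defer to the planning module
    (semi-MDP + motion primitives). Thresholds are illustrative."""
    if phase == "decision":
        required_gap = 15.0 if ov_preference == "aggressive" else 10.0
        return "change_lane" if gap_m >= required_gap else "keep_lane"
    return "run_motion_primitive"
```

The point of the sketch is the conditioning: an aggressive OV (likely to accelerate) demands a larger gap, which is how a preference estimate would shift the lane-changing position.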
Autonomic discovery of subgoals in hierarchical reinforcement learning (Cited by: 1)
Authors: XIAO Ding, LI Yi-tong, SHI Chuan. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2014, Issue 5, pp. 94-104 (11 pages)
The option is a promising method for discovering hierarchical structure in reinforcement learning (RL) to accelerate learning. The key to option discovery is how an agent can autonomically find useful subgoals among its past trails. By analyzing the agent's actions in the trails, useful heuristics can be found: not only does the agent pass through subgoals more frequently, but its effective actions are also restricted at subgoals. Consequently, subgoals can be identified as the most action-restricted states along the paths. In a grid-world environment, the concept of the unique-direction value (UDV), reflecting this action-restricted property, is introduced to find the most action-restricted states, and the UDV approach is used to form options autonomically both offline and online. Experiments show that the approach finds subgoals correctly, and that Q-learning with options found by both the offline and online processes accelerates learning significantly.
Keywords: hierarchical reinforcement learning; option; Q-learning; subgoal; UDV
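The action-restriction heuristic can be sketched concretely: score each state by how often it is visited relative to how many distinct exit directions were observed there. The scoring formula below is an illustrative stand-in, not the paper's exact UDV definition:

```python
from collections import defaultdict

def unique_direction_values(trajectories):
    """Score each state by how action-restricted it is: a state that is
    visited often but always exited in few directions (e.g. a doorway
    in a grid world) scores highly. Here the score is
    visits(s) / distinct exit actions at s, an illustrative formula,
    not the paper's exact UDV definition."""
    visits = defaultdict(int)
    exits = defaultdict(set)
    for trajectory in trajectories:          # each trajectory: (state, action) pairs
        for state, action in trajectory:
            visits[state] += 1
            exits[state].add(action)
    return {s: visits[s] / len(exits[s]) for s in visits}
```

High-scoring states would then serve as subgoals around which options are formed, offline from stored trails or online as new trails arrive.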