GACS: Generative Adversarial Imitation Learning Based on Control Sharing


Abstract: Generative adversarial imitation learning (GAIL) imitates expert behavior directly from human demonstrations instead of relying on explicitly designed reward signals, as reinforcement learning does. GAIL also overcomes the defects of traditional imitation learning by using a generative adversarial network framework, and it shows excellent performance in many fields. However, GAIL acts directly on immediate rewards, whose effect is reflected in the value function only after a period of accumulation. Thus, when faced with complex practical problems, the learning efficiency of GAIL is often extremely low and the policy may be slow to learn. One way to solve this problem is to guide the action (policy) directly during the agent's learning process, as the control sharing (CS) method does. This paper combines reinforcement learning and imitation learning and proposes a novel GAIL framework called generative adversarial imitation learning based on control sharing policy (GACS). GACS learns model constraints from expert samples and uses adversarial networks to guide learning directly. The actions produced by the adversarial networks are used to optimize the policy, which effectively improves learning efficiency. Experiments in an autonomous driving environment and the real-time strategy game Breakout show that GACS generalizes better, imitates expert behavior more efficiently, and learns better policies than other frameworks.

Authors: Huaiwei SI, Guozhen TAN, Dongyu LI, Yanfei PENG

Affiliation: School of Computer Science and Technology

Source: Journal of Systems Science and Information (系统科学与信息学报（英文）), CSCD, 2023, No. 1, pp. 78-93 (16 pages)

Funding: Supported in part by the National Natural Science Foundation of China (U1808206).

Keywords: generative adversarial imitation learning; reinforcement learning; control sharing; deep reinforcement learning

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

