
Timesharing-tracking Framework for Decentralized Reinforcement Learning in Fully Cooperative Multi-agent System

Abstract: Dimension-reduced and decentralized learning is widely viewed as an efficient way to solve multi-agent cooperative learning in high dimensions. However, the dynamic environment brought about by concurrent learning makes decentralized learning hard to converge and poor in performance. To tackle this problem, a timesharing-tracking framework (TTF) is proposed in this paper, stemming from the idea that alternating learning in the microscopic view results in concurrent learning in the macroscopic view. In TTF, joint-state best-response Q-learning (BRQ-learning) serves as the primary algorithm for adapting to the companions' policies. With a properly defined switching principle, TTF makes all agents learn the best responses to the others at different joint states. Thus, from the view of the whole joint-state space, the agents learn the optimal cooperative policy simultaneously. Simulation results illustrate that the proposed algorithm can learn the optimal joint behavior with less computation and at faster speed compared with two other classical learning algorithms.
Source: IEEE/CAA Journal of Automatica Sinica (SCIE, EI), 2014, No. 2, pp. 127-133 (7 pages). Acta Automatica Sinica (English Edition).
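The abstract describes the core TTF idea but not its exact update rules, so the following is only a minimal sketch of the timesharing principle on a two-agent cooperative matrix game: at any moment exactly one agent learns a best response (via Q-learning) while its companion plays a fixed greedy policy, and a simple switching rule (assumed here to be a fixed turn length, which the paper's actual switching principle would refine) hands the learning turn back and forth. The payoff table, learning rate, and turn length are all illustrative assumptions, not values from the paper.

```python
import random

# Cooperative payoff: both agents receive the same reward for a joint action.
# The (0, 0) joint action is optimal; (1, 1) is a weaker coordinated outcome.
REWARD = {
    (0, 0): 10, (0, 1): 0,
    (1, 0): 0,  (1, 1): 5,
}
ACTIONS = [0, 1]
ALPHA, EPS = 0.2, 0.1            # assumed learning rate and exploration rate
TURN_LENGTH = 100                # assumed switching rule: swap learner every 100 steps

Q = [{a: 0.0 for a in ACTIONS} for _ in range(2)]  # one Q-table per agent

def greedy(i):
    """Fixed greedy policy of agent i (ties broken toward action 0)."""
    return max(ACTIONS, key=lambda a: Q[i][a])

def eps_greedy(i):
    """Exploring policy used only by the agent whose turn it is to learn."""
    return random.choice(ACTIONS) if random.random() < EPS else greedy(i)

random.seed(0)
learner = 0
for step in range(2000):
    # Timesharing: the current learner explores; its companion tracks
    # its own fixed greedy policy (no simultaneous learning).
    acts = [0, 0]
    acts[learner] = eps_greedy(learner)
    acts[1 - learner] = greedy(1 - learner)
    r = REWARD[tuple(acts)]
    # Best-response Q update for the learner only (stateless game, so no bootstrap term).
    a = acts[learner]
    Q[learner][a] += ALPHA * (r - Q[learner][a])
    # Switching principle (assumed): hand the learning turn over periodically.
    if (step + 1) % TURN_LENGTH == 0:
        learner = 1 - learner

print(greedy(0), greedy(1))  # the agents settle on a coordinated joint action
```

Although each agent learns alone within its turn, over many turns both Q-tables adapt to each other, so in the aggregate the pair behaves as if learning concurrently; here they converge on the optimal joint action (0, 0). The paper's full framework additionally conditions the Q-tables and the switching rule on joint states, which this stateless sketch omits.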