An Algorithm of Cooperative Multiple Satellites Mission Planning Based on Multi-agent Reinforcement Learning (一种基于多Agent强化学习的多星协同任务规划算法)

Cited by: 21


Abstract: Based on an analysis of task characteristics and satellite constraints, a mathematical model of the multi-satellite cooperative mission planning problem is given. A constraint penalty operator and a multi-satellite joint penalty operator are introduced to modify the original utility gain function of each satellite agent. On this basis, a multi-satellite agent reinforcement learning algorithm (MUSARLA) is proposed to derive the cooperative task allocation strategy, and a blackboard-based interaction scheme among the satellites is designed to reduce the communication cost incurred during learning. Simulation experiments and analysis show that the method can effectively solve the multi-satellite cooperative mission planning problem.
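The abstract names the ingredients of the method but not its formulas, so the following Python sketch only illustrates the general idea and is not MUSARLA itself. It assumes a stateless, bandit-style value update, two fixed penalty constants, and a per-satellite capacity limit as the only constraint. All names (`penalized_gain`, `Blackboard`, `train`) and all numbers are hypothetical. Each satellite agent posts its task claims to a shared blackboard once per round instead of messaging every peer, and its gain for a task is reduced by a constraint penalty and a joint penalty.

```python
import random
from collections import defaultdict

# Illustrative constants (assumptions, not values from the paper).
CONSTRAINT_PENALTY = 4.0  # charged when a satellite exceeds its own capacity
JOINT_PENALTY = 3.0       # charged when another satellite claims the same task


def penalized_gain(value, over_capacity, duplicated):
    """Raw task value minus the two penalty terms."""
    gain = value
    if over_capacity:
        gain -= CONSTRAINT_PENALTY
    if duplicated:
        gain -= JOINT_PENALTY
    return gain


class Blackboard:
    """Shared store: each satellite posts its claims and can read all claims."""

    def __init__(self):
        self.claims = defaultdict(set)  # task index -> satellites claiming it

    def post(self, sat, tasks):
        for t in tasks:
            self.claims[t].add(sat)

    def claimants(self, task):
        return self.claims.get(task, set())


def train(task_values, capacity, n_sats=2, episodes=500,
          alpha=0.1, eps=0.2, seed=0):
    """Independent learners; q[sat][task] = [estimate for skip, estimate for take]."""
    rng = random.Random(seed)
    n_tasks = len(task_values)
    q = [[[0.0, 0.0] for _ in range(n_tasks)] for _ in range(n_sats)]
    for _ in range(episodes):
        board = Blackboard()
        choices = []
        for s in range(n_sats):
            picks = []
            for t in range(n_tasks):
                if rng.random() < eps:          # explore
                    a = rng.randrange(2)
                else:                           # exploit current estimates
                    a = 1 if q[s][t][1] > q[s][t][0] else 0
                picks.append(a)
            choices.append(picks)
            board.post(s, [t for t, a in enumerate(picks) if a == 1])
        for s in range(n_sats):
            over = sum(choices[s]) > capacity
            for t, a in enumerate(choices[s]):
                if a == 1:
                    dup = len(board.claimants(t)) > 1
                    r = penalized_gain(task_values[t], over, dup)
                else:
                    r = 0.0                     # skipping earns nothing
                q[s][t][a] += alpha * (r - q[s][t][a])
    return q


def greedy_allocation(q):
    """Tasks each satellite would take under its learned estimates."""
    return [[t for t, (skip, take) in enumerate(row) if take > skip] for row in q]


if __name__ == "__main__":
    q_table = train([5.0, 4.0, 3.0], capacity=2)
    print(greedy_allocation(q_table))
```

With a blackboard, each satellite writes once and reads once per round, so the message count grows linearly with the number of satellites rather than quadratically as in pairwise exchange. This sketch makes no claim about convergence or about the quality of the resulting allocation.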

Authors: Wang Chong, Jing Ning, Li Jun, Wang Jun, Chen Hao

Affiliation: College of Electronic Science and Engineering, National University of Defense Technology

Source: Journal of National University of Defense Technology (《国防科技大学学报》; indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2011, No. 1, pp. 53-58 (6 pages)

Funding: National Natural Science Foundation of China (60604035); National 863 High-Tech Program (2007AA12020203)

Keywords: satellite mission planning; cooperative planning; multi-agent reinforcement learning; blackboard architecture

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


