Abstract
This paper addresses the learning problem of cooperative teams that seek to maximize the benefit of the system as a whole. Building on the idea of stochastic games, we propose a new cooperative reinforcement learning method for multi-agent systems. Each agent in the team observes the historical behavior of its cooperating acquaintances, predicts their behavior strategies according to the stochastic game model, and thereby derives the optimal joint behavior strategy.
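The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes: an agent keeps counts of an acquaintance's past actions, uses them as a prediction of that acquaintance's strategy, and chooses its own action to maximize the expected team value. It is a generic joint-action learner with empirical-frequency prediction, not the method of the paper. The class name `TeamAgent`, its parameters (`alpha`, `gamma`, `epsilon`), the two-agent setting, and the shared action set are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class TeamAgent:
    """Hypothetical joint-action learner for a two-agent team with a shared reward.

    Assumption: both agents draw from the same finite action set.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # Q-values indexed by (state, own_action, other_action).
        self.q = defaultdict(float)
        # Observed history: counts[state][other_action].
        self.counts = defaultdict(lambda: defaultdict(int))

    def predict(self, state):
        """Empirical distribution over the acquaintance's actions in this state."""
        seen = self.counts[state]
        total = sum(seen.values())
        if total == 0:
            # No history yet: fall back to a uniform guess.
            return {a: 1.0 / len(self.actions) for a in self.actions}
        return {a: seen[a] / total for a in self.actions}

    def expected_value(self, state, own_action):
        """Expected team value of own_action under the predicted strategy."""
        belief = self.predict(state)
        return sum(p * self.q[(state, own_action, b)] for b, p in belief.items())

    def best_action(self, state):
        return max(self.actions, key=lambda a: self.expected_value(state, a))

    def act(self, state, rng=random):
        """Epsilon-greedy choice around the predicted best response."""
        if rng.random() < self.epsilon:
            return rng.choice(self.actions)
        return self.best_action(state)

    def update(self, state, own_action, other_action, reward, next_state):
        """Record the observed action, then apply a Q-learning style update."""
        self.counts[state][other_action] += 1
        future = max(self.expected_value(next_state, a) for a in self.actions)
        key = (state, own_action, other_action)
        self.q[key] += self.alpha * (reward + self.gamma * future - self.q[key])
```

In this sketch the prediction step is a simple frequency count. Any richer model of the acquaintance's strategy could replace `predict` without changing the rest of the class.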
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
2006, Issue 2, pp. 107-110 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 70371008)
Keywords
reinforcement learning
multi-agent system
stochastic game
cooperation