Abstract
This paper proposes a learning algorithm for multi-agent systems. Influence diagrams are used as the representation tool for agents. Given an initial model of an agent and its behavior history, a new model is constructed by learning the agent's capabilities, beliefs, and preferences. The learning method uses the observed behavior history of other agents as a training set and applies a neural network, together with decision knowledge and expert knowledge, to modify the connections between the nodes of the influence diagram. Behavior that is inconsistent with the agent's history is interpreted as a random deviation from an underlying utility function, and the utility function is adjusted by simulation with the Markov chain Monte Carlo (MCMC) technique. An example of cooperative air combat by a multi-aircraft formation illustrates the practicality of the algorithm.
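The idea of treating inconsistent behavior as random deviations from a utility function and adjusting that function by MCMC can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the utility is linear in hypothetical action features, deviations follow a softmax choice model, the prior is a Gaussian centered on the initial weights, and the sampler is a plain random-walk Metropolis algorithm. The names `log_likelihood`, `log_prior`, and `metropolis` are hypothetical.

```python
import math
import random


def log_likelihood(weights, history):
    """Log-probability of the observed choices under a softmax choice model (assumed)."""
    total = 0.0
    for options, chosen in history:
        # Linear utility of each available action (assumed form).
        utilities = [sum(w * f for w, f in zip(weights, feats)) for feats in options]
        peak = max(utilities)
        # Numerically stable log-sum-exp normaliser.
        log_norm = peak + math.log(sum(math.exp(u - peak) for u in utilities))
        total += utilities[chosen] - log_norm
    return total


def log_prior(weights, initial, sigma=2.0):
    """Gaussian prior centred on the initial model's weights (assumed)."""
    return -sum((w - w0) ** 2 for w, w0 in zip(weights, initial)) / (2 * sigma ** 2)


def metropolis(initial, history, steps=5000, step_size=0.3, seed=0):
    """Random-walk Metropolis sampler; returns the posterior-mean weights."""
    rng = random.Random(seed)
    current = list(initial)
    current_score = log_likelihood(current, history) + log_prior(current, initial)
    samples = []
    for _ in range(steps):
        proposal = [w + rng.gauss(0.0, step_size) for w in current]
        score = log_likelihood(proposal, history) + log_prior(proposal, initial)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, score - current_score)):
            current, current_score = proposal, score
        samples.append(current)
    kept = samples[steps // 5:]  # discard the first 20% as burn-in
    return [sum(s[i] for s in kept) / len(kept) for i in range(len(initial))]


# Hypothetical history: the initial model prefers action 0,
# but the observed agent chose action 1 twenty times.
options = [[1.0, 0.0], [0.0, 1.0]]
history = [(options, 1)] * 20
initial_weights = [1.0, 0.0]
adjusted_weights = metropolis(initial_weights, history)
print(adjusted_weights)
```

In this toy setting the adjusted weights shift toward the feature of the action the agent actually chose, while the prior keeps them from drifting arbitrarily far from the initial model.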
Source
Journal of Systems Engineering (《系统工程学报》)
CSCD
PKU Core Journal (北大核心)
2008, No. 3, pp. 377-380 (4 pages)
Funding
Supported by the Air Force National Defense Pre-research Project (4020501)
Keywords
multi-agent systems
influence diagram
learning
neural network