Abstract
The value of a coalition is fuzzy and uncertain, which makes it difficult to compute, and computing it is even harder when the coalition's membership changes. Lerman et al. implemented management methods for agents joining and leaving a dynamic coalition. Chalkiadakis investigated Bayesian reinforcement learning for coalition formation under uncertainty, but did not address how the coalition value changes as membership changes. This paper defines an estimated core with a discount factor and proposes a reinforcement learning algorithm to compute the coalition value after a membership change, extending the work of Chalkiadakis. Experimental results verify the feasibility and correctness of the method.
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal (北大核心)
2006, No. 6, pp. 85-87 (3 pages)
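Funding
Major Program of the National Natural Science Foundation of China (No. 60496323)
Science and Technology Program of the Shandong Provincial Department of Education (No. JSJ03J1)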
Keywords
multi-agent system, dynamic coalition formation, reinforcement learning