3 articles found
Research on a Profit-Sharing Reinforcement Learning Method Based on Semi-Autonomous Agents (cited by 3)
1
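Authors: Yang Kewei (杨克巍), Zhang Shaoding (张少丁), Cen Kaihui (岑凯辉), Tan Yuejin (谭跃进). Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 15, pp. 72-75, 97 (5 pages)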
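This paper applies the profit-sharing reinforcement learning method to a system based on semi-autonomous agents and compares it with Q-learning, a reinforcement learning method based on dynamic programming. In dynamic environments with many uncertain factors, where the system's state transitions do not form a Markov process, the profit-sharing method has a significant advantage. Building on the constrained nature that characterizes semi-autonomous agents, the paper proposes a reinforcement learning model for systems based on such agents, and analyzes the profit-sharing model experimentally, using the search for a safe, concealed position in a battlefield simulation as the example.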
Keywords: reinforcement learning; semi-autonomous agent; profit-sharing; Q-learning
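The contrast the abstract draws between profit-sharing and Q-learning can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the geometric credit function, and the `decay`, `alpha`, and `gamma` defaults are all assumptions chosen for illustration. Profit-sharing reinforces every state-action pair of an episode once a reward arrives, without bootstrapping from a successor state's value, whereas Q-learning updates one pair at a time from the estimated value of the next state.

```python
from collections import defaultdict


def profit_sharing_update(weights, episode, reward, decay=0.5):
    """Distribute an episode's reward back along its (state, action) trace.

    Assumption: a geometrically decreasing credit function, a common choice
    for profit-sharing; the paper's actual credit function is not given here.
    """
    credit = reward
    # Walk backwards so the pair closest to the reward gets the most credit.
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= decay
    return weights


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """One-step Q-learning update, which bootstraps from the next state."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q


# Hypothetical three-step episode that ends with a reward of 8.
weights = defaultdict(float)
episode = [("s0", "a"), ("s1", "b"), ("s2", "a")]
profit_sharing_update(weights, episode, reward=8.0, decay=0.5)

# Hypothetical single transition for the Q-learning update.
q = defaultdict(float)
q_learning_update(q, "s0", "a", 1.0, "s1", ["a", "b"])
```

Because the profit-sharing update never consults the value of a successor state, it does not rely on the state signal being Markovian, which is the property the abstract points to.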
A Policy-Improving System for Adaptability to Dynamic Environments Using Mixture Probability and Clustering Distribution (cited by 1)
2
Authors: Uthai Phommasak, Daisuke Kitakoshi, Jun Mao, Hiroyuki Shioya. Journal of Computer and Communications, 2014, No. 4, pp. 210-219 (10 pages)
Along with the increasing need for rescue robots in disasters such as earthquakes and tsunamis, there is an urgent need to develop robotics software that can learn and adapt to any environment. A reinforcement learning (RL) system that improves agents’ policies for dynamic environments by using a mixture model of Bayesian networks has been proposed, and is effective in quickly adapting to a changing environment. However, the increase in computational complexity requires a high-performance computer for simulated experiments, and when calculation resources are limited, it becomes necessary to control the computational complexity. In this study, we used an RL profit-sharing method for the agent to learn its policy, and introduced a mixture probability into the RL system to recognize changes in the environment and appropriately improve the agent’s policy to adjust to a changing environment. We also introduced a clustering distribution that enables a smaller, suitable selection while maintaining a variety of mixture probability elements, in order to reduce the computational complexity and simultaneously maintain the system’s performance. Using our proposed system, the agent successfully learned the policy and efficiently adjusted to the changing environment. Finally, the proposed system was effective in controlling both the computational complexity and the decline in the effectiveness of the policy improvement.
Keywords: Reinforcement Learning; Profit-Sharing Method; Mixture Probability; Clustering
A Reinforcement Learning System to Dynamic Movement and Multi-Layer Environments
3
Authors: Uthai Phommasak, Daisuke Kitakoshi, Hiroyuki Shioya, Junji Maeda. Journal of Intelligent Learning Systems and Applications, 2014, No. 4, pp. 176-185 (10 pages)
Many policy-improving systems have been proposed for Reinforcement Learning (RL) agents. They adapt quickly to environmental change by using statistical methods such as a mixture model of Bayesian Networks, Mixture Probability, and Clustering Distribution. However, such methods increase the computational complexity. In addition, good adaptation performance is required in more complex environments, such as multi-layer environments. In this study, we used the profit-sharing method for the agent to learn its policy, and added a mixture probability into the RL system to recognize changes in the environment and appropriately improve the agent’s policy to adjust to the changing environment. We also introduced a clustering that enables a smaller, suitable selection in order to reduce the computational complexity and simultaneously maintain the system’s performance. The experimental results showed that the agent successfully learned the policy and efficiently adjusted to changes in the multi-layer environment. Finally, the computational complexity and the decline in effectiveness of the policy improvement were controlled by using our proposed system.
Keywords: Reinforcement Learning; Profit-Sharing Method; Mixture Probability; Clustering