
Dynamic Modeling of Internal Cognitive Status of Pigeon in the Process of Reinforcement Learning
Abstract: In the classic Q-learning model the learning rate is a fixed parameter, which cannot capture the dynamic process of cognitive learning. A Q-learning model in which the learning rate is a time-varying parameter is proposed, together with a method for estimating the phased learning rate from recent behavioral history. To evaluate the model, a three-phase experimental paradigm was designed in which the relationship between the conditioned stimulus and the reward for the operant behavior varied from unrelated to related and back to unrelated, so that the pigeons' learning could be observed under random reinforcement, fixed reinforcement, and extinction of the fixed reinforcement contingency. A touch-screen animal behavioral system was used to train three pigeons on a color-stimulus peck-choice decision task, and behavioral data from different training sessions were used to obtain least-squares estimates of the dynamic learning rate. The results indicate that the model with a dynamic learning rate yields a smaller behavior prediction error and faster error convergence, and that the dynamic trajectory of the learning rate effectively reflects the animal's internal learning state during cognitive behavioral training.
Source: Science Technology and Engineering (Peking University Core Journal), 2017, No. 13, pp. 120–125 (6 pages).
Keywords: dynamic learning rate; Q-learning; pigeon; behavior
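The approach described in the abstract — a standard Q-learning update whose learning rate is re-estimated per session by least squares on the choice prediction error — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the softmax temperature `beta`, the grid search over candidate rates, and the synthetic two-choice reward schedule are all assumptions.

```python
import numpy as np

def q_update(q, action, reward, alpha):
    """One-step Q-learning update for a stateless two-choice task."""
    q = q.copy()
    q[action] += alpha * (reward - q[action])
    return q

def prediction_error(alpha, actions, rewards, beta=5.0):
    """Sum of squared differences between the model's softmax choice
    probability and the observed (taken) choices, for a candidate alpha."""
    q = np.zeros(2)
    sse = 0.0
    for a, r in zip(actions, rewards):
        p = np.exp(beta * q) / np.exp(beta * q).sum()
        sse += (1.0 - p[a]) ** 2   # error in predicting the action actually taken
        q = q_update(q, a, r, alpha)
    return sse

def fit_session_alpha(actions, rewards, grid=np.linspace(0.01, 1.0, 100)):
    """Least-squares estimate of one session's learning rate via grid search."""
    errors = [prediction_error(a, actions, rewards) for a in grid]
    return grid[int(np.argmin(errors))]

# Synthetic session: option 0 is rewarded 80% of the time, option 1 only 20%.
rng = np.random.default_rng(0)
actions = rng.integers(0, 2, size=200)
rewards = np.where(actions == 0,
                   rng.random(200) < 0.8,
                   rng.random(200) < 0.2).astype(float)
alpha_hat = fit_session_alpha(actions, rewards)
print(f"estimated session learning rate: {alpha_hat:.2f}")
```

Repeating the fit on each session's data, as the paper does across training phases, yields a trajectory of learning-rate estimates whose rises and falls can be read as changes in the animal's internal learning state.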
