
Deep reinforcement learning with visual attention for pedestrian detection (cited by: 10)

Abstract: A new pedestrian detection method is proposed that incorporates a visual attention mechanism and uses deep reinforcement learning to train a viewpoint selection model, imitating how human vision searches for local key parts. The viewpoint selection model generates focused images and iteratively overlays them to search out key regions; a detection network then classifies pedestrians within these regions. The information entropy of the detection output measures the reliability of the result and serves as the reward with which deep reinforcement learning optimizes the viewpoint selection model. Because the viewpoint selection model and the detection network are trained collaboratively and iteratively, the method acquires a strong ability to select and discriminate local key regions, reducing the influence of body deformation and occlusion. Comparative experiments against classic part-based pedestrian detection methods on public pedestrian detection datasets show that the proposed method effectively improves pedestrian detection accuracy.
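The article itself provides no source code. Below is a minimal Python sketch of the entropy-as-reward idea described in the abstract: the detection network's output probability is scored by its Shannon entropy, and low entropy (a confident decision) yields a high reward for the viewpoint selection policy. The function names and the exact reward mapping (1 minus entropy) are assumptions for illustration only; the paper states just that information entropy measures the reliability of the detection result and serves as the reward.

```python
import numpy as np

def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli detector output p = P(pedestrian)."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # guard against log(0)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def reward_from_detection(p):
    """Hypothetical reward shaping: 1 - entropy, so a confident detection
    (entropy near 0 bits) earns the viewpoint policy a reward near 1."""
    return 1.0 - binary_entropy(p)  # binary entropy already lies in [0, 1]

# A confident detection earns a higher reward than an uncertain one:
print(reward_from_detection(0.95))  # ~0.71 (confident -> high reward)
print(reward_from_detection(0.55))  # ~0.01 (uncertain -> low reward)
```

In the method itself, this scalar reward would drive a deep reinforcement learning update of the viewpoint selection model, trained iteratively and collaboratively with the detection network; the network architectures and update rule are not reproduced here.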
Source: China Sciencepaper (《中国科技论文》, PKU Core Journal), 2017, No. 14, pp. 1570-1577 (8 pages)
Funding: General Scientific Research Project of the Liaoning Provincial Department of Education (LYB201616)
Keywords: visual attention; deep reinforcement learning; pedestrian detection; information entropy; deep learning