Evolutionary operant behavior learning model and its application to mobile robot obstacle avoidance (cited by: 3)
Abstract: To address the poor adaptability of mobile robots in obstacle avoidance, this paper proposes a method by which a mobile robot learns obstacle-avoidance behavior in an unknown environment, based on an Evolutionary Operant Behavior Learning Model (EOBLM). The model combines the evolutionary idea of the Genetic Algorithm (GA) with Adaptive Heuristic Critic (AHC) learning and Operant Conditioning (OC) theory, and is a modified version of the AHC learning architecture. The Adaptive Critic Element (ACE) is implemented as a multi-layer feedforward neural network whose weights are updated with the TD(λ) algorithm and gradient descent. This stage of learning generates tropism (orientation) information, which serves as an intrinsic motivation that determines the direction of evolution. The Adaptive Selection Element (ASE) optimizes operant behavior to achieve the best mapping from state to action. The optimization proceeds in two stages. In the first stage, the information entropy obtained by the OC learning algorithm is used as the individual fitness, and the GA searches for the optimal individual. In the second stage, the OC learning algorithm selects the optimal operant behavior within the optimal individual and yields a new information entropy value. Simulation experiments on mobile robot obstacle avoidance show that the EOBLM enables the robot to actively learn to avoid obstacles through continual interaction with an unknown environment, and that its self-learning and self-adaptive abilities are stronger than those of the traditional AHC method.
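The two quantities the abstract relies on, a temporal-difference error for the critic and an information entropy used as GA fitness, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, an "individual" is modeled as a table of per-state action probabilities, the TD error is the one-step form (the λ = 0 case) rather than full TD(λ), and lower entropy is assumed to mean higher fitness, which the abstract does not state.

```python
import math

def action_entropy(probs):
    """Shannon entropy (in bits) of one action-probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def fitness(individual):
    """Hypothetical fitness: negative mean entropy over all states.

    Assumption: a more decisive (lower-entropy) policy is fitter.
    `individual` is a list of per-state action-probability lists.
    """
    return -sum(action_entropy(row) for row in individual) / len(individual)

def td_error(reward, value, next_value, gamma=0.9):
    """One-step TD error r + gamma * V(s') - V(s) for the critic."""
    return reward + gamma * next_value - value

def select_best(population):
    """GA selection reduced to its simplest form: keep the fittest individual."""
    return max(population, key=fitness)

# Two hypothetical individuals, each covering two states and two actions.
undecided = [[0.5, 0.5], [0.5, 0.5]]
decisive = [[0.9, 0.1], [0.2, 0.8]]
best = select_best([undecided, decisive])
```

Here `select_best` returns `decisive`, because its mean entropy is lower than the 1 bit per state of `undecided`. A full GA would add crossover and mutation around this selection step, and the critic would feed `td_error` into a gradient-descent weight update; both are omitted to keep the sketch short.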
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2013, No. 8, pp. 2283-2288 (6 pages)
Funding: Zhejiang Provincial Youth Science Foundation project (LQ13F030012); Zhejiang A&F University Talent Start-up Project (2013FR023)
Keywords: mobile robot; Adaptive Heuristic Critic (AHC); operant conditioning; Genetic Algorithm (GA); obstacle avoidance