进化操作行为学习模型及在移动机器人避障上的应用 (Evolutionary operant behavior learning model and its application to mobile robot obstacle avoidance). Cited by: 3

Abstract: To address the poor adaptive ability of mobile robots in obstacle avoidance, this paper proposes an obstacle-avoidance learning method based on an Evolutionary Operant Behavior Learning Model (EOBLM). The model combines the evolutionary idea of the Genetic Algorithm (GA) with Adaptive Heuristic Critic (AHC) learning and Operant Conditioning (OC) theory, and is a modified version of the AHC learning architecture. The Adaptive Critic Element (ACE) is implemented as a multi-layer feedforward neural network whose weights are updated with the TD(λ) algorithm and gradient descent. This stage of learning generates tropism (orientation) information, which acts as intrinsic motivation and determines the direction of evolution. The Adaptive Selection Element (ASE) optimizes the operant behavior to achieve the best mapping from state to action. The optimization proceeds in two stages. In the first stage, the information entropy obtained by the OC learning algorithm serves as the individual fitness, and the GA searches for the optimal individual. In the second stage, the OC learning algorithm selects the optimal operant behavior within the optimal individual and yields a new information entropy value. Simulation experiments on mobile robot obstacle avoidance show that EOBLM enables the robot to actively learn to avoid obstacles through continual interaction with an unknown environment, and that its self-learning and self-adaptive abilities are stronger than those of the traditional AHC method.
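The two-stage optimization described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names (`entropy`, `oc_update`, `select_fittest`), the linear reward-style probability update, the learning rate, and the choice to treat lower entropy as higher fitness are all assumptions made for illustration only.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete action-selection distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def oc_update(probs, action, reward, rate=0.2):
    """Hypothetical operant-conditioning step (assumed update rule).

    A positive reward shifts probability mass toward the chosen action;
    a non-positive reward leaves the distribution unchanged.
    The result still sums to 1.
    """
    if reward <= 0:
        return list(probs)
    return [
        p + rate * (1.0 - p) if i == action else p * (1.0 - rate)
        for i, p in enumerate(probs)
    ]


def select_fittest(population):
    """GA-style selection step (assumption: lower entropy means fitter)."""
    return min(population, key=entropy)


# Stage 1: entropy of each individual's action distribution acts as fitness.
uniform = [0.25, 0.25, 0.25, 0.25]          # untrained individual
trained = oc_update(uniform, action=0, reward=1.0)
best = select_fittest([uniform, trained])

# Stage 2: OC learning refines the best individual and yields a new entropy.
refined = oc_update(best, action=0, reward=1.0)
new_entropy = entropy(refined)
```

In this sketch, entropy falls as the distribution concentrates on a rewarded action, which is one plausible reading of how entropy could signal learning progress; the paper itself should be consulted for the actual fitness definition.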

Authors: 郜园园 (GAO Yuanyuan), 朱凡 (ZHU Fan), 宋洪军 (SONG Hongjun)

Affiliation: School of Information Engineering, Zhejiang A&F University (浙江农林大学信息工程学院)

Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list; 2013, No. 8, pp. 2283-2288 (6 pages)

Funding: Zhejiang Provincial Youth Science Foundation project (LQ13F030012); Zhejiang A&F University Talent Start-up Project (2013FR023)

Keywords: mobile robot; Adaptive Heuristic Critic (AHC); operant conditioning; Genetic Algorithm (GA); obstacle avoidance

Classification: TP242 [Automation and Computer Technology: Detection Technology and Automatic Devices]


