Multi-robot path planning-oriented and fuzzy model-based reinforcement function structures (面向多机器人路径规划的一种基于模糊模型的再励函数结构)

Cited by: 3

Abstract: Reinforcement learning is an emerging intelligent learning paradigm. Because its learning mechanism is simple and it requires neither prior knowledge of the system nor sample data, it is increasingly applied to learning in systems whose environment model is unknown. One of its current problems, however, is that learning is slow, which makes it difficult to guarantee real-time operation. Researchers have tried to speed up learning by improving the learning algorithm, adopting intelligent exploration policies, or applying hierarchical reinforcement learning. How to describe the reinforcement function, and how that function affects learning speed, has seldom been studied. Existing reinforcement learning systems usually use a hand-defined, model-free reinforcement function, and its overly simple and coarse form is one of the main causes of low learning efficiency. This paper presents a new fuzzy model-based reinforcement function structure, developed for the conflict-free path planning problem of a cooperative multiple mobile robot system. In this system, robot behaviors are divided into three basic kinds: moving toward the goal, avoiding obstacles, and avoiding other robots. The sub-functions reflecting these basic behaviors are modeled hierarchically and fuzzily, and the overall reinforcement function is expressed as the fuzzy weighted sum of the sub-functions. The fuzzy model-based reinforcement function expresses the influence of each robot's action on the environment more accurately. Simulation shows that reinforcement learning with the fuzzy model-based reinforcement function converges faster than with a model-free reinforcement function.
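The abstract does not give the membership functions, weights, or sub-function definitions used in the paper, so the following is only a minimal sketch of what a fuzzy weighted sum of behavior sub-functions might look like, contrasted with a hand-defined model-free reward. Every name and number here (closeness, safe_distance, max_step, the linear membership shape, the collision and goal radii) is an illustrative assumption, not the authors' method.

```python
def closeness(distance, safe_distance):
    """Assumed fuzzy membership in 'close': 1 at contact, 0 at safe_distance or beyond."""
    if safe_distance <= 0:
        raise ValueError("safe_distance must be positive")
    return max(0.0, min(1.0, 1.0 - distance / safe_distance))


def model_free_reward(goal_dist, obstacle_dist, robot_dist,
                      collision_radius=0.1, goal_radius=0.1):
    """Hand-defined baseline (assumed form): only collisions and arrival are signalled."""
    if min(obstacle_dist, robot_dist) <= collision_radius:
        return -1.0
    if goal_dist <= goal_radius:
        return 1.0
    return 0.0


def fuzzy_reinforcement(prev_goal_dist, goal_dist, obstacle_dist, robot_dist,
                        max_step=1.0, safe_distance=2.0):
    """Fuzzy weighted sum of three behavior sub-functions (illustrative only)."""
    near_obstacle = closeness(obstacle_dist, safe_distance)
    near_robot = closeness(robot_dist, safe_distance)

    # Sub-functions, one per basic behavior named in the abstract.
    progress = (prev_goal_dist - goal_dist) / max_step
    r_goal = max(-1.0, min(1.0, progress))   # reward progress toward the goal
    r_obstacle = -near_obstacle              # penalize proximity to obstacles
    r_robot = -near_robot                    # penalize proximity to other robots

    # Assumed fuzzy weights: goal seeking matters most when the path is clear,
    # avoidance matters more as an obstacle or another robot gets closer.
    w_goal = 1.0 - max(near_obstacle, near_robot)
    w_obstacle = near_obstacle
    w_robot = near_robot

    return w_goal * r_goal + w_obstacle * r_obstacle + w_robot * r_robot


if __name__ == "__main__":
    # Same state, two reward signals: the baseline stays silent until a collision,
    # while the fuzzy version already grades the step.
    print(model_free_reward(4.0, 1.0, 10.0))
    print(fuzzy_reinforcement(5.0, 4.0, 1.0, 10.0))
```

In this sketch the graded signal is what distinguishes the two functions: the baseline returns zero for almost every step, whereas the fuzzy version gives feedback on each action, which is one plausible reading of why the abstract reports faster convergence.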

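Authors: Zhang Fang (张芳), Yan Guozheng (颜国正), Lin Liangming (林良明)

Affiliation: School of Electronics and Information, Shanghai Jiao Tong University (上海交通大学电子信息学院)

Source: Optics and Precision Engineering (《光学精密工程》), indexed in EI, CAS, CSCD; 2002, No. 2, pp. 148-153 (6 pages)

Keywords: robots; reinforcement learning; reinforcement function; fuzzy model; conflict-free path planning

Classification: TP242 [Automation and Computer Technology: Detection Technology and Automatic Equipment]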


