Skinner operant conditioning learning model based on genetic algorithm

Cited by: 3

Abstract: Using probabilistic automata (PA) as the platform and drawing on the evolutionary ideas of the genetic algorithm (GA), this paper designs a bionic learning model that reflects Skinner's operant conditioning (OC). The model is named the genetic algorithm-operant conditioning probabilistic automaton (GA-OCPA) learning system. After each learning trial, the system first takes the information entropy value obtained by the OC learning algorithm as the fitness of an individual. It then runs the genetic algorithm to search for the optimal individual. Finally, it runs the OC learning algorithm again to learn the optimal operant action within the optimal individual, which yields a new information entropy value. The convergence of the GA-OCPA learning algorithm is analyzed theoretically, and simulations of motion balancing control for a two-wheeled robot indicate that learning in the GA-OCPA system is a process of autonomously acquiring and refining knowledge, with a high degree of adaptive ability.
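The three-step loop described in the abstract (entropy as fitness, GA search, then OC learning inside the best individual) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: an individual is modeled as a small list of candidate actions, the OC update is a linear reward-inaction rule, lower entropy is treated as better fitness, and the names `oc_learn`, `ga_ocpa`, and the reward function are hypothetical.

```python
import math
import random


def entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def oc_learn(actions, reward, rng, trials=200, rate=0.1):
    """Toy operant conditioning: a rewarded action becomes more probable.

    Uses a linear reward-inaction update (an assumption, not the paper's rule).
    Returns the learned probability of each action in `actions`.
    """
    n = len(actions)
    probs = [1.0 / n] * n
    for _ in range(trials):
        chosen = rng.choices(range(n), weights=probs)[0]
        if reward(actions[chosen]):
            probs = [p + rate * (1.0 - p) if j == chosen else p * (1.0 - rate)
                     for j, p in enumerate(probs)]
    return probs


def ga_ocpa(reward, action_pool, rng, pop_size=8, genes=4, generations=10):
    """Alternate OC learning (fitness = entropy) with a simple GA search."""
    population = [[rng.choice(action_pool) for _ in range(genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Step 1: entropy after OC learning is the fitness (lower is better here).
        scored = sorted(population,
                        key=lambda ind: entropy(oc_learn(ind, reward, rng)))
        # Step 2: GA search: keep the better half, refill by crossover and mutation.
        parents = scored[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genes)
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:
                child[rng.randrange(genes)] = rng.choice(action_pool)
            children.append(child)
        population = parents + children
    # Step 3: OC learning inside the best individual selects its best action.
    best = min(population, key=lambda ind: entropy(oc_learn(ind, reward, rng)))
    probs = oc_learn(best, reward, rng)
    best_action = best[probs.index(max(probs))]
    return best, best_action, entropy(probs)


if __name__ == "__main__":
    rng = random.Random(0)
    target = 3  # hypothetical control action that balances the robot
    best, action, h = ga_ocpa(lambda a: abs(a - target) <= 1,
                              list(range(-10, 11)), rng)
    print(best, action, round(h, 3))
```

In this sketch an individual that contains no rewarded action keeps a uniform distribution and therefore maximal entropy, so the GA tends to discard it; that is one plausible reading of how an entropy value can serve as fitness, not a statement of the authors' exact definition.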

Authors: Cai Jianxian (蔡建羡), Ruan Xiaogang (阮晓钢)

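Affiliations: School of Electronic Information and Control Engineering, Beijing University of Technology; Institute of Disaster Prevention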

Source: Systems Engineering and Electronics (《系统工程与电子技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, No. 6, pp. 1370-1376 (7 pages)

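Funding: National Natural Science Foundation of China (60774077); National High Technology Research and Development Program of China (863 Program) (2007AA04Z226); Key Project of the Beijing Municipal Education Commission (KZ200810005002)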

Keywords: operant conditioning; genetic algorithm; probabilistic automata; motion balancing control

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
