Operant conditioning is one of the fundamental mechanisms of animal learning, which suggests that the behavior of all animals, from protists to humans, is guided by its consequences. We present a new stochastic learning automaton, called a Skinner automaton, that serves as a psychological model formalizing the theory of operant conditioning. We identify animal operant learning with a thermodynamic process and derive a so-called Skinner algorithm from the Monte Carlo method, the Metropolis algorithm, and simulated annealing. Under certain conditions, we prove that the Skinner automaton is expedient, ε-optimal, and optimal, and that the operant probabilities converge to the set of stable roots with probability 1. The Skinner automaton enables machines to learn autonomously in an animal-like way.
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 61075110, 60774077, 61375086), the National Basic Research Program of China ("973" Project) (Grant No. 2012CB720000), the National High-Tech Research and Development Program of China ("863" Project) (Grant No. 2007AA04Z226), the Beijing Natural Science Foundation (Grant No. 4102011), the Key Project of S&T Plan of Beijing Municipal Commission of Education (Grant Nos. KM2008-10005016, KZ201210005001), and the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20101103110007).
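The abstract names the ingredients of the Skinner algorithm (Monte Carlo sampling, Metropolis acceptance, simulated annealing) but does not state the update rule itself. The following is a minimal illustrative sketch of how those ingredients could combine in an operant-selection loop. It is not the paper's Skinner algorithm: the function name `metropolis_operant_learning`, the use of a running mean reward as a negative "energy", the geometric cooling schedule, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def metropolis_operant_learning(reward_probs, steps=5000, t0=1.0,
                                cooling=0.999, seed=0):
    """Hypothetical sketch: choose operants by Metropolis acceptance
    while the temperature is lowered as in simulated annealing."""
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n    # how often each operant was emitted
    means = [0.0] * n   # running mean reward observed per operant
    current = rng.randrange(n)
    temperature = t0

    for _ in range(steps):
        # Monte Carlo proposal: a uniformly random candidate operant.
        candidate = rng.randrange(n)

        # Assumption: energy = -(estimated reward), so lower energy is better.
        delta = means[current] - means[candidate]  # E_candidate - E_current

        # Metropolis rule: always accept improvements, otherwise accept
        # with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate

        # The environment reinforces the emitted operant stochastically.
        reward = 1.0 if rng.random() < reward_probs[current] else 0.0
        counts[current] += 1
        means[current] += (reward - means[current]) / counts[current]

        # Geometric cooling with a small floor to avoid division by zero.
        temperature = max(temperature * cooling, 1e-3)

    return counts, means


if __name__ == "__main__":
    counts, means = metropolis_operant_learning([0.2, 0.5, 0.9])
    print("emission counts:", counts)
    print("estimated rewards:", [round(m, 3) for m in means])
```

At high temperature the sketch emits all operants almost indiscriminately. As the temperature falls, emissions concentrate on the operant with the highest estimated reward, which is the qualitative behavior the abstract describes as convergence of the operant probabilities.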