Abstract: This paper considers how to find certain joint distributions, and their marginal distributions, of the crossing time and renewal numbers of two PH-renewal processes by constructing an absorbing Markov process.
Abstract: In this paper, we develop a new theoretical framework, based on absorbing Markov process theory, for analyzing some stochastic global optimization algorithms. Applying the framework to pure random search, we prove that pure random search converges to the global minimum in probability and that its time follows a geometric distribution. We also analyze pure adaptive search within this framework and show that it converges to the global minimum in probability and that its time follows a Poisson distribution.
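The geometric-distribution claim for pure random search can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: the objective is f(x) = x**2 on [-1, 1], samples are uniform, and "reaching the minimum" means landing in the level set |x| <= eps. The names `hitting_time` and `mean_hitting_time` are hypothetical. Under these assumptions each sample succeeds independently with probability p = eps, so the number of samples needed is geometric with mean 1/p.

```python
import random


def hitting_time(rng, eps):
    """Count uniform samples on [-1, 1] until one lands in |x| <= eps.

    Assumed toy setting: f(x) = x**2, so |x| <= eps is the level set
    f(x) <= eps**2 around the global minimum at 0.
    """
    n = 0
    while True:
        n += 1
        x = rng.uniform(-1.0, 1.0)
        if x * x <= eps * eps:
            return n


def mean_hitting_time(eps, trials, seed=0):
    """Average hitting time over independent runs of pure random search."""
    rng = random.Random(seed)
    return sum(hitting_time(rng, eps) for _ in range(trials)) / trials


if __name__ == "__main__":
    eps = 0.1  # success probability per sample is 2*eps / 2 = eps
    print("empirical mean:", mean_hitting_time(eps, 20000))
    print("geometric mean 1/p:", 1.0 / eps)
```

The absorbing state in this picture is "the level set has been hit": every step moves to it with the same probability p, which is what makes the absorption time geometric.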
Abstract: Aim: To find a more efficient learning method based on temporal difference learning for delayed reinforcement learning tasks. Methods: A Q-learning algorithm based on truncated TD(λ), with adaptive schemes for selecting the value of λ and addressed to absorbing Markov decision processes, was presented and implemented on computers. Results and Conclusion: Simulations on shortest-path search problems show that using an adaptive λ in Q-learning based on TTD(λ) can speed up its convergence.
Funding: Supported by Agence Nationale de la Recherche (Grant Nos. ANR-11-LABX-0040-CIMIANR-11-IDEX-0002-02 and ANR-12-BS01-0019).
Abstract: Consider a finite absorbing Markov generator, irreducible on the non-absorbing states. Perron-Frobenius theory ensures the existence of a corresponding positive eigenvector ψ. The goal of the paper is to give bounds on the amplitude max ψ / min ψ. Two approaches are proposed: one using a path method, and the other, restricted to the reversible situation, based on spectral estimates. The latter approach is extended to denumerable birth and death processes absorbing at 0 for which infinity is an entrance boundary. The interest of estimating this ratio is that it reduces the quantitative study of convergence to quasi-stationarity to the convergence to equilibrium of related ergodic processes, as seen by Diaconis and Miclo (2014).
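To make the amplitude max ψ / min ψ concrete, the following sketch computes it numerically for a small example. Everything specific here is an assumption for illustration, not the paper's method: ψ is taken to be the right Perron-Frobenius eigenvector of the generator restricted to the non-absorbing states, the example is a three-state birth and death chain on {1, 2, 3} with unit rates absorbed at 0, and the eigenvector is found by plain power iteration on the uniformized matrix I + Q/c. The names `positive_eigenvector`, `amplitude` and `Q_EXAMPLE` are hypothetical.

```python
import math


def positive_eigenvector(Q, iters=500):
    """Approximate the positive right eigenvector of a sub-generator Q.

    Q is the generator restricted to the non-absorbing states. Uniformizing
    with c > max_i |Q[i][i]| gives a nonnegative matrix P = I + Q/c with a
    strictly positive diagonal, so power iteration converges to the
    Perron-Frobenius eigenvector when Q is irreducible.
    """
    n = len(Q)
    c = max(-Q[i][i] for i in range(n)) + 1.0
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / c for j in range(n)]
         for i in range(n)]
    psi = [1.0] * n
    for _ in range(iters):
        new = [sum(P[i][j] * psi[j] for j in range(n)) for i in range(n)]
        top = max(new)
        psi = [v / top for v in new]  # normalize so that max(psi) == 1
    return psi


def amplitude(psi):
    """Ratio max(psi) / min(psi) of a strictly positive vector."""
    return max(psi) / min(psi)


# Assumed example: states 1, 2, 3 with unit birth and death rates;
# state 1 jumps to the absorbing state 0 at rate 1.
Q_EXAMPLE = [[-2.0, 1.0, 0.0],
             [1.0, -2.0, 1.0],
             [0.0, 1.0, -1.0]]

# For this chain psi(k) is proportional to sin(k * pi / 7), which gives a
# closed form to compare against.
CLOSED_FORM = math.sin(3 * math.pi / 7) / math.sin(math.pi / 7)

if __name__ == "__main__":
    print("numerical amplitude:", amplitude(positive_eigenvector(Q_EXAMPLE)))
    print("closed form:", CLOSED_FORM)
```

Power iteration is used only to keep the sketch dependency-free; a linear algebra library's eigenvalue routine would serve equally well for larger state spaces.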