
Optimal Policies for Quantum Markov Decision Processes (cited by: 2)

Abstract: A Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide useful mathematical tools for reinforcement learning techniques applied to the quantum world.
Source: International Journal of Automation and Computing (English edition; indexed in EI and CSCD), 2021, Issue 3, pp. 410-421 (12 pages)
Funding: partly supported by the National Key R&D Program of China (No. 2018YFA0306701), the Australian Research Council (Nos. DP160101652 and DP180100691), the National Natural Science Foundation of China (No. 61832015), and the Key Research Program of Frontier Sciences, Chinese Academy of Sciences.
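
For orientation, the finite-horizon dynamic programming mentioned in the abstract can be illustrated in the classical setting. The sketch below is ordinary backward induction for a classical MDP, not the paper's qMDP algorithm, which works with quantum states and is not reproduced in this record. The function name `backward_induction`, the array layout, and the two-state toy model are all assumptions made for illustration.

```python
def backward_induction(P, R, horizon):
    """Classical finite-horizon backward induction (illustrative sketch).

    P[a][s][t] is the probability of moving from state s to state t
    under action a; R[a][s] is the immediate reward for taking action a
    in state s. Returns the optimal values at time 0 and a
    time-dependent policy where policy[k][s] is the action at time k.
    """
    n_actions = len(R)
    n_states = len(R[0])
    V = [0.0] * n_states  # value with zero steps remaining
    policy = []
    for _ in range(horizon):
        # Q[s][a]: reward now plus expected value of the remaining steps.
        Q = [
            [
                R[a][s] + sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        step = [max(range(n_actions), key=lambda a: Q[s][a])
                for s in range(n_states)]
        V = [Q[s][step[s]] for s in range(n_states)]
        policy.append(step)
    policy.reverse()  # the loop ran from the last decision back to the first
    return V, policy


# Hypothetical toy model: action 0 stays put, action 1 switches state;
# staying in state 1 earns reward 1, everything else earns 0.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 1.0],  # action 0
    [0.0, 0.0],  # action 1
]
values, policy = backward_induction(P, R, horizon=3)
```

In this toy model the optimal first move from state 0 is to switch, after which staying in state 1 collects the reward on each remaining step.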