Factored Belief States Space Compression Algorithm Based on Dynamic Bayesian Network
Cited by: 2
Abstract: To address the curse of dimensionality in the belief state space of partially observable Markov decision processes (POMDPs), this paper proposes a factored belief states space compression (FBSSC) algorithm based on dynamic Bayesian networks (DBNs), exploiting the fact that belief state variables are decomposable and exhibit independence relationships. The algorithm builds a graph of the dependencies among variables, removes redundant edges by means of independence tests, and decomposes the joint probability of the transition function into a product of conditional probabilities, thereby compressing the belief state space without loss. Comparison experiments and RoboCupRescue simulation results show that the algorithm has a lower error rate, better convergence, and general applicability.
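The factorization step described in the abstract can be illustrated with a small sketch. This is not the FBSSC algorithm itself, whose details the abstract does not give; it only shows how a joint transition probability can be written as a product of per-variable conditional probabilities once each variable's parents in the DBN are known. The three binary variables, the `PARENTS` structure, the `CPT` values, and the single implicit action are all made-up assumptions for illustration.

```python
import itertools

# Hypothetical two-slice DBN over three binary state variables.
# PARENTS[i] lists the time-t variables that variable i at time t+1 depends on
# (the edges that would remain after redundant ones have been removed).
PARENTS = {0: (0,), 1: (0, 1), 2: (2,)}

# CPT[i][parent_values] = P(X_i' = 1 | parent values); the numbers are invented.
CPT = {
    0: {(0,): 0.1, (1,): 0.8},
    1: {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.9},
    2: {(0,): 0.3, (1,): 0.7},
}

# All joint assignments of the three binary variables.
STATES = list(itertools.product((0, 1), repeat=3))


def transition_prob(s, s_next):
    """P(s_next | s) as a product of per-variable conditional probabilities."""
    p = 1.0
    for i, parents in PARENTS.items():
        p_one = CPT[i][tuple(s[j] for j in parents)]
        p *= p_one if s_next[i] == 1 else 1.0 - p_one
    return p


def full_table_size():
    """Number of entries in an unfactored |S| x |S| transition table."""
    return len(STATES) * len(STATES)


def factored_size():
    """Number of stored numbers in the factored representation."""
    return sum(len(table) for table in CPT.values())


if __name__ == "__main__":
    # Each row of the implied joint transition matrix is a distribution.
    for s in STATES:
        row_sum = sum(transition_prob(s, n) for n in STATES)
        assert abs(row_sum - 1.0) < 1e-9
    print(full_table_size(), factored_size())
```

In this toy setting the factored form stores 8 numbers where a flat transition table would hold 64 entries, and the product still reproduces every joint probability exactly, which is the sense in which such a decomposition is lossless when the assumed independences actually hold.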
Source: Information and Control (《信息与控制》), CSCD, Peking University Core Journal, 2012, No. 6, pp. 713-719 (7 pages)
Funding: National Natural Science Foundation of China (61074058, 60874042); Natural Science Foundation of Guangdong Province (S2011040004769)
Keywords: Markov decision process (MDP); dynamic Bayesian network (DBN); curse of dimensionality; belief state space; conditional independence