Factored Belief States Space Compression Algorithm Based on Dynamic Bayesian Network

Original Chinese title: 基于动态贝叶斯网络的可分解信念状态空间压缩算法

Abstract: To address the curse of dimensionality in the belief state space of partially observable Markov decision processes (POMDPs), this paper proposes a factored belief states space compression (FBSSC) algorithm based on dynamic Bayesian networks (DBNs), exploiting the fact that belief state variables are decomposable and exhibit independence relationships. The algorithm builds a dependency graph over the variables, removes redundant edges using independence tests, and factors the joint probability of the transition function into a product of conditional probabilities, which achieves lossless compression of the belief state space. Comparison experiments and RoboCupRescue simulation results show that the algorithm has a lower error rate, better convergence, and general applicability.
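The abstract gives only the outline of the method, so the following is a minimal Python sketch of the general idea rather than the authors' FBSSC implementation. It assumes binary state variables, an exactly known transition table, and no dependencies between variables within the same time slice. The function names (`marginal_cpt`, `find_parents`, `factorize`, `rebuild`) and the example probabilities are hypothetical, and an exact equality check stands in for whatever statistical independence test the paper uses.

```python
import itertools
import numpy as np


def marginal_cpt(joint, var):
    """Return P(x_var' | x) by summing out the other next-step variables.

    joint has shape (2,)*n + (2,)*n: current-state axes first, next-state axes last.
    """
    n = joint.ndim // 2
    other_next = tuple(n + j for j in range(n) if j != var)
    return joint.sum(axis=other_next)  # shape (2,)*n + (2,)


def find_parents(joint, var, tol=1e-9):
    """Keep the edge x_j -> x_var' only if the CPT changes with x_j."""
    n = joint.ndim // 2
    cpt = marginal_cpt(joint, var)
    parents = []
    for j in range(n):
        # Edge is redundant when the slices x_j = 0 and x_j = 1 coincide.
        if not np.allclose(np.take(cpt, 0, axis=j), np.take(cpt, 1, axis=j), atol=tol):
            parents.append(j)
    return parents


def factorize(joint):
    """Return one (parents, reduced CPT) pair per state variable."""
    n = joint.ndim // 2
    factors = []
    for var in range(n):
        parents = find_parents(joint, var)
        cpt = marginal_cpt(joint, var)
        # Drop non-parent axes, highest index first so axis numbers stay valid.
        for j in reversed(range(n)):
            if j not in parents:
                cpt = np.take(cpt, 0, axis=j)
        factors.append((parents, cpt))
    return factors


def rebuild(factors, n):
    """Multiply the conditional factors back into the full joint table."""
    joint = np.ones((2,) * (2 * n))
    for x in itertools.product((0, 1), repeat=n):
        for y in itertools.product((0, 1), repeat=n):
            p = 1.0
            for var, (parents, cpt) in enumerate(factors):
                p *= cpt[tuple(x[j] for j in parents) + (y[var],)]
            joint[x + y] = p
    return joint


# Hypothetical 3-variable model: x0' <- x0, x1' <- (x0, x1), x2' <- x2.
p0 = np.array([[0.9, 0.1], [0.2, 0.8]])
p1 = np.array([[[0.7, 0.3], [0.4, 0.6]],
               [[0.5, 0.5], [0.1, 0.9]]])
p2 = np.array([[0.6, 0.4], [0.3, 0.7]])
joint = np.einsum("ai,abj,ck->abcijk", p0, p1, p2)

factors = factorize(joint)
factored_size = sum(cpt.size for _, cpt in factors)
print("parents per variable:", [parents for parents, _ in factors])
print("table entries: full =", joint.size, ", factored =", factored_size)
print("lossless:", np.allclose(rebuild(factors, 3), joint))
```

Under these assumptions the factored tables reproduce the full joint table exactly, which is the sense in which such a compression is lossless. With transition probabilities estimated from data, the exact comparison in `find_parents` would need to be replaced by a statistical test with a significance threshold.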

Authors: Wu Bo (仵博), Wu Min (吴敏), Zheng Hongyan (郑红燕), Feng Yanpeng (冯延蓬)

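Affiliations: School of Information Science and Engineering, Central South University; Hunan Engineering Laboratory for Advanced Control and Intelligent Automation; Education Technology and Information Center, Shenzhen Polytechnic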

Source: Information and Control (《信息与控制》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 6, pp. 713-719 (7 pages)

Funding: National Natural Science Foundation of China (61074058, 60874042); Natural Science Foundation of Guangdong Province (S2011040004769)

Keywords: Markov decision process (MDP); dynamic Bayesian network (DBN); curse of dimensionality; belief state space; conditional independence

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

