基于拓扑信息加速马尔科夫毯学习被引量：1

Accelerating the Recovery of Markov Blanket Using Topology Information

下载PDF

导出

摘要目标变量的马尔科夫毯(MB)是用于预测其状态的最优特征子集。提出一种新的约束学习类MB推导算法FSMB,它遵循后向选择的搜索策略,并依赖条件独立(CI)测试删除任意结点对之间的伪连接。与传统约束学习类算法不同,FSMB能从已执行的CI测试推导出不同结点扮演d-分割(d-separation)结点的优先等级;而后基于该信息在未来优先执行条件集中包含高优先级结点的CI测试,从而更快速地判断并删除伪连接边。该策略可帮助快速缩小搜索空间,从而大大提升学习效率。基于仿真网络的实验研究显示,FSMB在计算效率上较经典的PCMB和IPC-MB有显著的提升,而学习效果相当;在面对较大网络结构时(比如100和200个结点),甚至比公认最快速的IAMB还节省近40%的计算量,但学习效果要远优于IAMB。基于16个UCI数据集和4个经典的分类模型的实验显示,基于FSMB输出的特征集合所训练模型的分类准确率普遍接近或高于基于原有特征全集训练所得模型。因此,FSMB是快速且有效的MB推导算法。 Markov blanket（MB） has been known as the optimal feature subset for prediction, and there exist fertile works to induce MB by local search since 1996. A novel one called FSMB was proposed which heavily relies on condi- tional independence（CI） test to determine the existence of connection between nodes, so it is kind of constraint-based learning as well. However,it differs from previous works by treating candidate CI tests unfairly. FSMB extracts critical d^separation topology information from conducted CI tests, and applies them to sort and perform those more likely to uncover independent relations with priority. Search space therefore is expected to shrink quickly in a more efficient man- ner. Experimental studies indicate that FSMB achieves tremendous improvement over state-obart works PCMB and IPC-MB in term of time efficiency, but with no sacrifice on learning quality. When given large networks（e, g. 100 and 200 nodes）, FSMB runs even more efficiently than IAMB which is recognized as the fastest algorithm by now, requiring up to 40~ fewer CI tests, and produces much higher quality of results. Experiments with UCI data sets and four classi- cal classification models indicate that the classification accuracy of the models trained on the output of FSMB are close to or exceed performance achieved by models trained on all features, hence FSMB is an effective feature subset selector.

作者傅顺开苏致祯 Sein Minn 吕天依

机构地区华侨大学计算机科学与技术学院

出处《计算机科学》 CSCD 北大核心 2015年第B11期42-48,共7页 Computer Science

基金国家自然科学基金资助项目(61305058 61300139) 福建省自然科学基金(2014J05074) 厦门科技计划基金资助项目(3505Z20133027) 华侨大学科研基金资助项目(11Y0274 12HJY18) 中央高校基本科研基金资助项目(11J0263)资助

关键词马尔科夫毯贝叶斯网络局部搜索结构学习约束学习条件独立测试 Markov blanket, Bayesian network, Local search, Structure learning, Constraint-based learning, Conditional independence test

分类号 TP391 [自动化与计算机技术—计算机应用技术]

引文网络
相关文献

参考文献19

1Pearl J. Probabilistic reasoning in expert systems[M]. San Ma- tego: Morgan Kaufmann, 1988.
2Koller D,Sahami M. Toward optimal feature selection[C]//the 13th International Conference on Machine Learning(lCMl.). Bari, Italy: Morgan Kaufmann, 1996.
3Chickering D M, Geiger D, Heckerman 13. Learning Bayesian Network is NP-Hard[R]. Microsoft Research, 1994.
4Campos C P D, Zeng Z, Ji Q. Efficient structure learning of Bayesian networks using constraints [J]. Journal of Machine Learning Research(JMLR), 2011,12 ( 11 ) : 663 689.
5Tsamardinos I, Aliferis C, Statnikov A, et al. Algorithms for large scale Markov blanket discovery[C] ff 16th International FLAIRS Conference, 2003. AAAI, 2003.
6Zhang Y, Zhang Z, Liu K, et al. An improved lAMB algorithm for Markov blanket diseovery[J]. Journal of Computers. 2010,5 (11):1755-1761.
7Zhang Y,Xu H,Huang Y,et al. SIAMB algorithm for Markov blanket diseovery[C]//Asia-Pacific Conference on Information Processing(APCIP09). Washington: IEEE Computer Society.
8Tsamardinos h Aliferis C F,Statnikov A. Time and sample effi- cient discovery of Markov blankets and direct causal relations [C]//9th ACM SIGKDI) International Conference on Know- ledge Discovery and Data Mining(KDD). ACM, 2003.
9Aliferis C, Tsamardinos I, Statnikov A. HITON, A Novel Mar- kov Blanket Algorithm for Optimal Variable Selection[C]//An nual Symposium on American Medical Informatics Association (AMIA). 2003 Pena J M,Nilsson R, Bjorkegren J, et al. Towards scalable and z17 .
10Pena J M, Nilsson R, Bjorkegren J, et al. "l'owards scalable anddata efficient learning of Markov boundaries[J]. International Journal of Approximate Reasoning,2007,45(2) :211-2S2.

同被引文献1

1王大荣,张忠占.线性回归模型中变量选择方法综述[J].数理统计与管理,2010,29(4):615-627. 被引量：69

引证文献1

1严曙,胡晓波,王儒敬.基于正则线性模型的马尔科夫边学习算法[J].计算机应用与软件,2020,37(2):182-191. 被引量：1

二级引证文献1

1奚增辉,王卫斌,姚嵘,瞿海妮.大数据分析的商用数据中心用电量预测[J].自动化与仪器仪表,2021(12):53-55.

1李冰寒,高晓利,刘三阳.基于互信息学习贝叶斯网络等价类[J].计算机应用研究,2011,28(1):81-83. 被引量：2
2郭程,张龙军,许钟华.云存储环境下的支持隐私保护方案[J].计算机系统应用,2014,23(11):87-91. 被引量：1
3赵波,吴庆畅,尹世堂,范菁.基于条件独立性测试的贝叶斯网结构学习改进算法[J].云南民族大学学报（自然科学版）,2011,20(5):402-405. 被引量：3
4郭坤,王浩,姚宏亮,李俊照.逻辑回归分析的马尔可夫毯学习算法[J].智能系统学报,2012,7(2):153-160. 被引量：2
5张翀,周经野,韩忠愿.基于自由度分析的装配体约束求解[J].湘潭大学自然科学学报,2006,28(1):41-45. 被引量：1
6WHBW.点点鼠标,给邮件划分优先等级[J].网友世界,2010(18):56-56.
7贾海洋,刘大有,陈娟,关淞元,刘欣.免疫遗传算法学习贝叶斯网等价类[J].吉林大学学报（理学版）,2009,47(1):48-56. 被引量：4
8冀俊忠,阎静,刘椿年.基于I-B&B-MDL的贝叶斯网结构学习改进算法[J].北京工业大学学报,2006,32(5):436-441. 被引量：5
9华北工控PCMB-6688隆重上市[J].测控技术,2009,28(5):100-100.
10柳传志.联想为什么做砸多元化却成功实现了国际化[J].IT时代周刊,2007(21):18-18.

计算机科学

2015年第B11期

浏览历史

内容加载中请稍等...

基于拓扑信息加速马尔科夫毯学习被引量：1

参考文献19

同被引文献1

引证文献1

二级引证文献1

相关作者

相关机构

相关主题

浏览历史

基于拓扑信息加速马尔科夫毯学习 被引量：1

参考文献19

同被引文献1

引证文献1

二级引证文献1

相关作者

相关机构

相关主题

浏览历史

基于拓扑信息加速马尔科夫毯学习被引量：1