
Multi-label Classification in Network Environments via Seed Node Selection (cited by: 3)
Abstract: Multi-label classification is widely used in practical problems such as gene classification, drug discovery, and text classification. Existing multi-label classification algorithms usually build their training set by selecting nodes from the network at random. However, different nodes play different roles while the classification algorithm runs: for a fixed training-set size, different choices of training nodes yield different classification accuracy. We therefore introduce the concept of seed nodes. Label classification starts from the seed nodes and, through iterative inference, obtains the labels of all other nodes in the network. We propose the SHDA (Nodes Selection of High Degree from Each Affiliation) algorithm, which selects high-degree nodes from each community (affiliation) of the network in proportion, merges the selections, and processes the result to obtain the seed nodes. Experiments on real-world datasets show that using the seed nodes as the training set improves the accuracy of multi-label classification in network environments.
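The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `select_seed_nodes`, the `ratio` parameter, the rule of taking at least one node per community, the tie-break by node id, and the reading of the "processing" step as simple de-duplication of the merged selections are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def select_seed_nodes(edges, communities, ratio):
    """Pick about `ratio` of each community's nodes, highest degree first.

    edges:       iterable of undirected (u, v) pairs
    communities: dict mapping a community name to a list of member nodes
                 (communities may overlap)
    ratio:       fraction of each community to select (assumed parameter)
    """
    # Degree of each node, counted over the whole network.
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    seeds = set()  # a set merges the per-community picks without duplicates
    for members in communities.values():
        # Assumption: every community contributes at least one node.
        quota = max(1, math.ceil(ratio * len(members)))
        # Highest degree first; ties broken by node id for determinism.
        ranked = sorted(members, key=lambda n: (-degree[n], n))
        seeds.update(ranked[:quota])
    return sorted(seeds)


# Toy network with two overlapping communities (node 4 belongs to both).
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5), (5, 6), (5, 7), (6, 7)]
communities = {"A": [1, 2, 3, 4], "B": [4, 5, 6, 7]}
print(select_seed_nodes(edges, communities, ratio=0.5))  # [1, 2, 4, 5]
```

In this toy network, nodes 1 and 5 have the highest degree in their communities, and the remaining picks come from the tie-break. The returned nodes would then serve as the training set from which the labels of the other nodes are inferred.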
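Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2016, No. 9, pp. 2074-2080 (7 pages)
Funding: National Key Basic Research Program of China (973 Program) (No. 2013CB329604); Innovative Research Team program of the Ministry of Education (No. IRT13059); National Natural Science Foundation of China (No. 61229301, No. 61503114)
Keywords: multi-label classification; network; seed nodes; inference; community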
