
Feature selection for co-training (Cited by: 2)
Abstract: Co-training is a semi-supervised learning method that employs two complementary learners, which label the unlabeled data for each other and predict test samples jointly. Previous studies show that redundant information can improve the accuracy of semi-supervised learning methods relative to supervised learning methods. In practice, however, redundant information often hurts the performance of learning machines. This paper investigates what effect redundant features have on semi-supervised learning methods such as co-training, and how to remove redundant features as well as irrelevant ones. FESCOT (feature selection for co-training) is proposed to improve the generalization performance of co-training through feature selection. Experimental results on artificial and real-world data sets show that FESCOT helps remove irrelevant and redundant features that hurt the performance of co-training.
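The co-training scheme the abstract describes can be sketched in a few lines: two learners are trained on separate feature views, each labels its most confident unlabeled examples for the other, and the two predict test samples jointly. The sketch below is not the authors' FESCOT; the per-view `SelectKBest` filter, the naive Bayes learners, and all sizes and thresholds are illustrative assumptions standing in for the paper's feature-selection criterion.

```python
# Minimal co-training sketch with per-view feature selection.
# NOT the paper's FESCOT: SelectKBest and GaussianNB are stand-in choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.naive_bayes import GaussianNB

# Synthetic data with deliberately redundant features.
X, y = make_classification(n_samples=500, n_features=20, n_informative=4,
                           n_redundant=8, random_state=0)
views = [X[:, :10], X[:, 10:]]          # two feature views

rng = np.random.RandomState(0)
idx = rng.permutation(500)
L, U, T = idx[:50], idx[50:400], idx[400:]   # labeled / unlabeled / test

# Per-view filter on the labeled data (assumption: a simple ANOVA filter,
# mimicking the removal of irrelevant/redundant features before co-training).
sels = [SelectKBest(f_classif, k=5).fit(v[L], y[L]) for v in views]
Xv = [s.transform(v) for s, v in zip(sels, views)]

y_train = y.astype(float).copy()        # working labels (pseudo-labels go here)
labeled = [list(L), list(L)]            # each learner's labeled pool
pool = list(U)

for _ in range(10):
    if not pool:
        break
    clfs = [GaussianNB().fit(Xv[i][labeled[i]],
                             y_train[labeled[i]].astype(int))
            for i in range(2)]
    for i in range(2):
        proba = clfs[i].predict_proba(Xv[i][pool])
        top = np.argsort(proba.max(axis=1))[-5:]   # 5 most confident examples
        picked = [pool[j] for j in top]
        y_train[picked] = proba.argmax(axis=1)[top]
        labeled[1 - i].extend(picked)   # learner i labels them for the other
        pool = [p for p in pool if p not in picked]

# Joint prediction: average the two learners' class probabilities.
probs = sum(clf.predict_proba(xv[T]) for clf, xv in zip(clfs, Xv)) / 2
acc = (probs.argmax(axis=1) == y[T]).mean()
print(f"test accuracy: {acc:.2f}")
```

The "confidence" heuristic here (top posterior probability) is the standard co-training device from Blum and Mitchell; the paper's contribution is selecting the features each view keeps before this loop runs.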
Source: Journal of Shanghai University (English Edition), CAS, 2008, No. 1, pp. 47-51 (5 pages)
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 20503015).
Keywords: feature selection, semi-supervised learning, co-training
