A Text Feature Selection Method Using the Population Based Incremental Learning Algorithm
Abstract: Many text feature selection methods exist, but each has certain limitations. This paper proposes a new text feature selection method based on the population based incremental learning (PBIL) algorithm. The method requires no prior knowledge of the feature set and is easy to implement, and because it uses the performance of a simple classifier as the evaluation criterion, its computational complexity is very low. Classification experiments on the Reuters-21578 corpus show that its average classification performance is better than that of three commonly used feature selection methods: the chi-square statistic, information gain, and a simple genetic algorithm.
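Source: Library and Information Service (《图书情报工作》), CSSCI, Peking University Core Journal, 2011, No. 24, pp. 102-105, 125 (5 pages)
Funding: One of the research outputs of the Hunan Provincial Natural Science Foundation project "Construction and Application of a Trust Evolution Model in the E-Commerce Environment" (Project No. 10JJ6111)
Keywords: population based incremental learning; feature selection; text categorization; genetic algorithm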
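
The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the simple classifier or any parameter values, so the nearest-centroid classifier, the synthetic data, and every function name and default below (`pbil_select`, `centroid_accuracy`, `make_toy_data`, the learning rate, the mutation settings) are assumptions made for illustration. The sketch keeps a probability vector over features, samples binary feature masks from it, scores each mask by classifier accuracy, and shifts the vector toward the best mask of each generation.

```python
import random


def pbil_select(fitness, n_features, pop_size=20, generations=30,
                learn_rate=0.1, mut_prob=0.02, mut_shift=0.05, seed=0):
    """Search for a binary feature mask with PBIL (all defaults are illustrative)."""
    rng = random.Random(seed)
    prob = [0.5] * n_features  # start with no preference for any feature
    best_mask, best_score = None, float("-inf")
    for _ in range(generations):
        # Sample a population of masks from the current probability vector.
        population = [[1 if rng.random() < p else 0 for p in prob]
                      for _ in range(pop_size)]
        scored = [(fitness(mask), mask) for mask in population]
        gen_score, gen_best = max(scored, key=lambda pair: pair[0])
        if gen_score > best_score:
            best_score, best_mask = gen_score, gen_best
        # Move each probability toward the best mask of this generation.
        prob = [(1 - learn_rate) * p + learn_rate * bit
                for p, bit in zip(prob, gen_best)]
        # Occasionally nudge a probability toward 0 or 1 to keep diversity.
        for i in range(n_features):
            if rng.random() < mut_prob:
                prob[i] = (1 - mut_shift) * prob[i] + mut_shift * rng.choice((0, 1))
    return best_mask, best_score, prob


def make_toy_data(n_docs, n_features, n_informative, rng):
    """Synthetic two-class data; only the first n_informative features carry signal."""
    docs = []
    for i in range(n_docs):
        label = i % 2  # alternate labels so both classes are always present
        vec = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        docs.append((vec, label))
    return docs


def centroid_accuracy(mask, train, valid):
    """Validation accuracy of a nearest-centroid classifier on the selected features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [vec for vec, y in train if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for vec, y in valid:
        dists = {label: sum((vec[j] - c) ** 2 for j, c in zip(idx, cent))
                 for label, cent in centroids.items()}
        if min(dists, key=dists.get) == y:
            correct += 1
    return correct / len(valid)


if __name__ == "__main__":
    rng = random.Random(1)
    data = make_toy_data(100, 8, 2, rng)
    train, valid = data[:60], data[60:]
    mask, score, prob = pbil_select(
        lambda m: centroid_accuracy(m, train, valid), n_features=8)
    print("selected mask:", mask)
    print("validation accuracy:", score)
```

Because each candidate mask is scored with a cheap classifier, the cost per generation is roughly the population size times one training and evaluation pass, which is consistent with the abstract's claim of low computational complexity. On real text data the mask would index vocabulary terms rather than synthetic columns.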