
Text Feature Selection Method Based on Improved TFIDF (cited by: 1)
Abstract: This paper analyzes several common evaluation functions for feature selection, applies a term-weighting function to feature selection, and proposes a new evaluation function for text feature selection based on an improved TFIDF, called TFIDF-Dac. To strengthen a feature term's ability to discriminate between categories, the new function introduces information about how the term is distributed across categories into the formula, compensating for a shortcoming of traditional TFIDF. Experimental tests show that the improved feature selection method effectively improves the accuracy of text classification.
Source: Modern Computer (《现代计算机》), 2009, No. 7, pp. 34-36, 86 (4 pages)
Keywords: text classification; feature selection; evaluation function; TFIDF
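
This record does not reproduce the TFIDF-Dac formula, so the following is only a minimal Python sketch of the general idea the abstract describes: using a TFIDF-style weight as a feature selection score and scaling it by a measure of how unevenly a term is spread across categories. The function names (`tfidf_scores`, `class_spread`, `select_features`) and the spread factor (standard deviation of per-class document fractions) are assumptions made for illustration and should not be read as the paper's method.

```python
import math
from collections import Counter, defaultdict


def tfidf_scores(docs):
    """Score each term by corpus-wide term frequency times log(N / df)."""
    n_docs = len(docs)
    tf = Counter()  # total occurrences of each term
    df = Counter()  # number of documents containing each term
    for tokens in docs:
        tf.update(tokens)
        df.update(set(tokens))
    return {t: tf[t] * math.log(n_docs / df[t]) for t in tf}


def class_spread(docs, labels):
    """Hypothetical class-distribution factor (NOT the paper's formula):
    standard deviation, over classes, of the fraction of a class's
    documents that contain the term."""
    class_sizes = Counter(labels)
    per_class_df = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        per_class_df[label].update(set(tokens))
    terms = {t for tokens in docs for t in tokens}
    spread = {}
    for t in terms:
        fracs = [per_class_df[c][t] / class_sizes[c] for c in class_sizes]
        mean = sum(fracs) / len(fracs)
        spread[t] = math.sqrt(sum((f - mean) ** 2 for f in fracs) / len(fracs))
    return spread


def select_features(docs, labels, k):
    """Keep the k terms with the highest TFIDF score times class spread."""
    base = tfidf_scores(docs)
    spread = class_spread(docs, labels)
    combined = {t: base[t] * spread[t] for t in base}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(combined, key=lambda t: (-combined[t], t))[:k]


# Toy corpus (made up for illustration): two categories, four documents.
docs = [
    ["goal", "match", "team"],
    ["team", "goal", "score"],
    ["stock", "market", "team"],
    ["market", "price", "stock"],
]
labels = ["sports", "sports", "finance", "finance"]
print(select_features(docs, labels, 3))
```

In this toy corpus, "team" occurs in both categories, so both its inverse document frequency and its class spread are low and it ranks last. Terms confined to a single category rank highest, which is the kind of class-discriminating behavior the abstract attributes to the improved function.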