
Application of text categorization based on improved maximum entropy means clustering algorithm (cited by: 4)
Abstract: Traditional text classification algorithms give every feature term the same influence on the classification result, which lowers classification accuracy and increases time complexity. To address these problems, this paper proposes an improved maximum entropy C-means clustering method for text classification. The method combines the strengths of C-means clustering and the maximum entropy algorithm: it uses Shannon entropy as the objective function of the maximum entropy model, which simplifies the form of the classifier, and then applies C-means clustering to classify on the optimal features. Simulation results show that, compared with traditional text classification methods, the proposed method obtains the optimal subset of classification features quickly and substantially improves text classification accuracy.
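The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the author's algorithm. It assumes that a term whose occurrences are concentrated in one class (low Shannon entropy over classes) is a better feature, keeps the lowest-entropy terms, and then runs plain hard C-means (k-means) on the reduced vectors. The function names, the toy term-count matrix, and the choice of the first `c` points as initial centers are all illustrative assumptions.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H(p) = -sum(p * log2(p)); zero terms are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def term_entropy(docs, labels, term_index):
    """Entropy of one term's occurrence counts distributed over the classes."""
    classes = sorted(set(labels))
    counts = [sum(d[term_index] for d, y in zip(docs, labels) if y == cls)
              for cls in classes]
    total = sum(counts)
    if total == 0:
        # Assumption: a term that never occurs is treated as uninformative.
        return math.log2(len(classes))
    return shannon_entropy([n / total for n in counts])

def select_features(docs, labels, k):
    """Indices of the k terms with the lowest class entropy (assumed criterion)."""
    n_terms = len(docs[0])
    ranked = sorted(range(n_terms), key=lambda j: term_entropy(docs, labels, j))
    return sorted(ranked[:k])

def c_means(points, c, iterations=20):
    """Hard C-means; the first c points serve as initial centers (assumption)."""
    centers = [list(p) for p in points[:c]]
    assign = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(c),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
        # Update step: each center becomes the mean of its members.
        for j in range(c):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers

# Hypothetical toy data: rows are documents, columns are term counts.
docs = [[3, 0, 2, 1], [0, 4, 2, 1], [2, 0, 1, 1], [0, 3, 1, 1]]
labels = ["sports", "finance", "sports", "finance"]

selected = select_features(docs, labels, k=2)
reduced = [[d[j] for j in selected] for d in docs]
assign, centers = c_means(reduced, c=2)
```

In this toy data, terms 0 and 1 each occur in only one class, so they are the two selected features, and clustering on them separates the two groups of documents. A real system would need a proper term-weighting scheme and a more careful initialization than the one assumed here.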
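Author: 张爱科 (Zhang Aike)
Source: 《计算机应用研究》 (Application Research of Computers), CSCD, Peking University Core Journal, 2012, No. 4, pp. 1297-1299 (3 pages)
Funding: Scientific research project funded by the Guangxi Department of Education (200911LX486, 201106LX745)
Keywords: text classification; maximum entropy; C-means clustering; feature selection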

