Improvement of a Text Categorization Algorithm Based on Maximum Entropy
Abstract: Text categorization based on the maximum entropy model is an effective method that performs better than typical text categorization algorithms such as Bayes, KNN, and SVM. However, its results vary considerably across different test documents; that is, its stability is poor. To address this, the algorithm is improved with a Boosting mechanism in order to increase its stability. Experimental results show that the improved method effectively improves the stability of the maximum-entropy-based categorization algorithm and also yields some gain in classification accuracy.
Source: Journal of Xi'an Shiyou University (Natural Science Edition), 2009, No. 6, pp. 77-79 (3 pages). Indexed in CAS and the Peking University Core Journals list.
Keywords: text categorization algorithm; maximum entropy model; boosting algorithm; stability
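The abstract does not spell out how Boosting is wrapped around the maximum entropy classifier, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a binary task with labels in {-1, +1}, uses standard AdaBoost reweighting, and treats a weighted logistic regression (the binary form of a maximum entropy model) trained by plain gradient descent as the base learner. All function names, the four-word vocabulary, and the toy documents are hypothetical.

```python
import math

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def score(theta, x):
    # theta holds one weight per feature plus a trailing bias term.
    return theta[-1] + sum(t * v for t, v in zip(theta, x))

def train_maxent(X, y, w, epochs=200, lr=0.5):
    """Fit a weighted binary maximum-entropy (logistic regression) model."""
    theta = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi, wi in zip(X, y, w):
            # Gradient of the weighted log loss for one document.
            err = sigmoid(score(theta, xi)) - (1.0 if yi == 1 else 0.0)
            for j, v in enumerate(xi):
                grad[j] += wi * err * v
            grad[-1] += wi * err
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def predict_maxent(theta, x):
    return 1 if score(theta, x) >= 0 else -1

def adaboost(X, y, rounds=5):
    """Standard AdaBoost over the maximum-entropy base learner."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform document weights
    ensemble = []
    for _ in range(rounds):
        theta = train_maxent(X, y, w)
        preds = [predict_maxent(theta, x) for x in X]
        eps = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        if eps >= 0.5:                     # no better than chance: stop
            break
        # Small constant keeps the logarithm finite when eps is 0.
        alpha = 0.5 * math.log((1 - eps + 1e-10) / (eps + 1e-10))
        ensemble.append((alpha, theta))
        if eps == 0:                       # perfect fit: nothing left to reweight
            break
        # Raise the weight of misclassified documents, lower the rest.
        w = [wi * math.exp(-alpha * yi * p) for wi, p, yi in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict_ensemble(ensemble, x):
    vote = sum(alpha * predict_maxent(theta, x) for alpha, theta in ensemble)
    return 1 if vote >= 0 else -1

# Hypothetical word counts over the vocabulary ["oil", "drill", "goal", "match"].
# The last document repeats the third with the opposite label, so no single
# base learner can classify every training document correctly.
X = [[2, 1, 0, 0], [1, 2, 0, 0], [1, 1, 0, 0],
     [0, 0, 2, 1], [0, 0, 1, 2], [0, 0, 1, 1],
     [1, 1, 0, 0]]
y = [1, 1, 1, -1, -1, -1, -1]

ensemble = adaboost(X, y)
print(len(ensemble), [predict_ensemble(ensemble, x) for x in X])
```

The final prediction is a weighted vote of the base models, which is the usual argument for why boosting can reduce the variation of a single classifier's results. A multi-class setting such as real text categorization would need a multi-class boosting variant, which this sketch does not cover.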