
A Text Classification Method for Incomplete Data Based on the Bernoulli Mixture Model
Abstract Building a text classifier from incomplete data is an important problem. This paper presents an improved method based on the Bernoulli mixture model and the Expectation Maximization (EM) algorithm. First, initial values for the likelihood-function parameters are estimated from the labeled data using the Bernoulli mixture model and the naive Bayes algorithm. Next, a weighted EM algorithm estimates the parameters of the classifier's prior probability model, which yields the final classifier. Experimental results show that the method outperforms naive Bayes text classification in both precision and recall.
Source Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2007, No. 5, pp. 1235-1237, 1250 (4 pages)
Funding Ministry of Education Excellent Talents Support Program, 2004 (NCET-04-0496); Key Scientific Research Project of the Ministry of Education (105087)
Keywords incomplete data set; text classification; naive Bayes classification; Bernoulli mixture model; Expectation Maximization (EM) algorithm
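
The two-stage procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions the record does not confirm: "incomplete data" is taken to mean documents whose class labels are missing, each class is modeled by a single multivariate Bernoulli component, and the "weight" in EM is a hypothetical factor `lam` that down-weights the unlabeled documents. All function names and the smoothing constant `alpha` are illustrative.

```python
import numpy as np

def fit_bernoulli_nb(X, R, alpha=1.0):
    """M-step: estimate class priors and per-word Bernoulli parameters.

    X: (n_docs, n_words) binary word-occurrence matrix.
    R: (n_docs, n_classes) soft (possibly weighted) class memberships.
    alpha: Laplace smoothing constant (an assumption, keeps theta in (0, 1)).
    """
    class_mass = R.sum(axis=0)  # effective number of documents per class
    priors = (class_mass + alpha) / (class_mass.sum() + alpha * R.shape[1])
    theta = (R.T @ X + alpha) / (class_mass[:, None] + 2.0 * alpha)
    return priors, theta

def log_joint(X, priors, theta):
    """log P(c) + log P(x | c) under the multivariate Bernoulli model."""
    return np.log(priors) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T

def posteriors(X, priors, theta):
    """E-step: P(c | x) for each document, computed stably in log space."""
    lj = log_joint(X, priors, theta)
    lj -= lj.max(axis=1, keepdims=True)
    p = np.exp(lj)
    return p / p.sum(axis=1, keepdims=True)

def weighted_em(X_lab, y_lab, X_unl, n_classes, lam=0.5, n_iter=20):
    """Initialize from labeled data, then refine with weighted EM."""
    R_lab = np.eye(n_classes)[y_lab]                # hard labels as one-hot rows
    priors, theta = fit_bernoulli_nb(X_lab, R_lab)  # stage 1: labeled data only
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):                         # stage 2: weighted EM
        R_unl = lam * posteriors(X_unl, priors, theta)
        priors, theta = fit_bernoulli_nb(X_all, np.vstack([R_lab, R_unl]))
    return priors, theta

def predict(X, priors, theta):
    """Assign each document to its most probable class."""
    return log_joint(X, priors, theta).argmax(axis=1)

# Toy data: words 0-1 indicate class 0, words 2-3 indicate class 1.
X_lab = np.array([[1, 1, 0, 0], [0, 0, 1, 1]])
y_lab = np.array([0, 1])
X_unl = np.array([[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 0, 0],
                  [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 1, 1]])
priors, theta = weighted_em(X_lab, y_lab, X_unl, n_classes=2)
labels = predict(X_lab, priors, theta)
```

Setting `lam` to 0 reduces the sketch to plain naive Bayes trained on the labeled documents alone, while `lam` equal to 1 treats every unlabeled document as a full (soft-labeled) training example.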

