
Research on Improvement of Bayesian Text Classifier (cited by: 11)
Abstract: In text classification, the Bayes classifier is a widely used and effective probability-based classifier with a fairly rigorous theoretical foundation. This paper analyzes the naive Bayes text classifier and proposes a weight-adjustment mechanism to improve its classification performance, together with a method for unsupervised Bayes classification using the EM algorithm when large amounts of training text are unavailable. It also discusses how heuristic methods can be used to determine the structure of a Bayesian network, so that text classification can be carried out under conditions closer to real-world settings.
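Author: Lu Mingyu (鲁明羽)
Source: Computer Engineering (《计算机工程》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2006, Issue 17, pp. 63-65 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 60473115)
Keywords: text categorization; naive Bayesian categorization model; weight adjustment; EM algorithm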
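
For orientation, the baseline that the abstract refers to, a naive Bayes text classifier, can be sketched as follows. This is a minimal illustration of the standard multinomial model with Laplace smoothing, not the paper's weight-adjustment or EM method, whose details are not given on this page. The function names `train_naive_bayes` and `classify`, the smoothing parameter `alpha`, and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods from tokenized documents.

    Hypothetical helper: `alpha` is the additive (Laplace) smoothing constant.
    """
    vocab = {word for doc in docs for word in doc}
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    # log P(c): fraction of training documents in each class
    log_prior = {c: math.log(n / len(docs)) for c, n in class_doc_counts.items()}

    # log P(w | c): smoothed relative frequency of each word within the class
    log_likelihood = {}
    for c in class_doc_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def classify(doc, model):
    """Return the class with the highest posterior score; unseen words are skipped."""
    log_prior, log_likelihood, vocab = model
    scores = {
        c: log_prior[c] + sum(log_likelihood[c][w] for w in doc if w in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
docs = [
    ["goal", "match", "team"],
    ["election", "vote", "party"],
    ["team", "win", "goal"],
    ["vote", "law", "party"],
]
labels = ["sports", "politics", "sports", "politics"]
model = train_naive_bayes(docs, labels)
prediction = classify(["team", "goal"], model)
```

Scores are summed in log space to avoid numerical underflow on long documents. A weight-adjustment scheme of the kind the abstract mentions would presumably modify how individual terms contribute to these scores, but the exact form is not stated here.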
