B-KNN Text Categorization Method Based on Posterior Probability Guidance
Abstract: The K Nearest Neighbor (KNN) method achieves high classification accuracy but has low efficiency. To address this, the paper proposes a Bayesian K Nearest Neighbor (B-KNN) method guided by posterior probability. B-KNN uses the posterior probabilities of the test text to prune the multi-branch static search tree built over the training set, then searches for the sample's K nearest neighbors within the reduced candidate class space. This improves the efficiency of KNN while preserving its classification accuracy. Experimental results show that B-KNN performs considerably better than KNN and is better suited to text categorization tasks with a deep class hierarchy.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the PKU Core list; 2011, No. 21, pp. 114-116 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 60975034)
Keywords: text categorization; posterior probability; Bayesian classifier; K Nearest Neighbor (KNN) method; B-KNN method
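
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe how the multi-branch static search tree is built, so the sketch replaces tree pruning with a flat filter that keeps only the `m` classes with the highest Naive Bayes posterior, then runs KNN inside that reduced set. The function names, the parameters `k` and `m`, the cosine similarity measure, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect counts for a multinomial Naive Bayes model (docs are token lists)."""
    vocab = {t for d in docs for t in d}
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        term_counts[y].update(d)
    return vocab, class_docs, term_counts


def log_posteriors(doc, model):
    """Unnormalized log posterior of each class, with Laplace smoothing."""
    vocab, class_docs, term_counts = model
    n = sum(class_docs.values())
    scores = {}
    for c, nc in class_docs.items():
        total = sum(term_counts[c].values()) + len(vocab)
        s = math.log(nc / n)
        for t in doc:
            if t in vocab:
                s += math.log((term_counts[c][t] + 1) / total)
        scores[c] = s
    return scores


def cosine(a, b):
    """Cosine similarity between two token lists using raw term frequencies."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def bknn_predict(doc, docs, labels, model, k=3, m=2):
    """Posterior-guided KNN: search neighbors only in the m most probable classes."""
    scores = log_posteriors(doc, model)
    # Assumed stand-in for tree pruning: keep the top-m classes by posterior.
    candidates = set(sorted(scores, key=scores.get, reverse=True)[:m])
    pool = [(cosine(doc, d), y) for d, y in zip(docs, labels) if y in candidates]
    pool.sort(key=lambda p: p[0], reverse=True)
    votes = Counter()
    for sim, y in pool[:k]:
        votes[y] += sim  # similarity-weighted vote
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
docs = [
    ["ball", "goal", "team"], ["team", "match", "score"],
    ["stock", "market", "price"], ["bank", "price", "stock"],
    ["cpu", "code", "chip"], ["code", "software", "chip"],
]
labels = ["sports", "sports", "finance", "finance", "tech", "tech"]
model = train_nb(docs, labels)
print(bknn_predict(["stock", "price", "bank"], docs, labels, model))  # finance
```

With `m` smaller than the number of classes, the similarity computation touches only the training documents of the retained classes, which is where the efficiency gain in this simplified version comes from. The accuracy cost depends on how often the true class falls outside the top `m`.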