An Improved Efficient Bayesian Short Message Text Classifier (cited by: 6)
Abstract: A Bayesian classifier model is proposed to classify short messages according to their content. The concept of a category energy space is introduced: each feature word is converted into an energy unit in that space, and a short message is then represented as an energy feature vector built from its words. To obtain the probability of each category, the neighborhood density of the energy feature vector is calculated and substituted into Bayes' formula. When the category probabilities differ only slightly, an SVM (support vector machine) model is used to reclassify the short message and improve the classification result. The experimental results show that the proposed model classifies well and is superior to other classification methods in classification results.
Source: Journal of Nanjing Normal University (Engineering and Technology Edition), CAS, 2014, No. 3, pp. 70-74 (5 pages)
Fund: National Spark Program project, Rural Livelihood Construction Information Feedback Platform project (2011GA690190)
Keywords: short message; text classification; Bayesian; SVM (support vector machine); category energy space
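The two-stage decision described in the abstract (Bayesian posterior first, a second classifier only when the top category probabilities are close) can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the abstract does not define the category energy space or the density computation, so a plain multinomial naive Bayes with Laplace smoothing stands in for that step, and `fallback` is a hypothetical callable standing in for the SVM. The function names and the `margin` threshold are illustrative only.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Fit priors and per-category word counts from (tokens, label) pairs.

    Assumption: a stand-in for the paper's energy-vector density model.
    """
    label_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    total = sum(label_counts.values())
    priors = {c: n / total for c, n in label_counts.items()}
    return priors, word_counts, vocab


def posteriors(tokens, priors, word_counts, vocab):
    """Return P(category | tokens) via Bayes' formula, normalized to sum to 1."""
    log_scores = {}
    for c, prior in priors.items():
        denom = sum(word_counts[c].values()) + len(vocab)  # Laplace smoothing
        score = math.log(prior)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][t] + 1) / denom)
        log_scores[c] = score
    peak = max(log_scores.values())
    unnorm = {c: math.exp(s - peak) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def classify(tokens, model, fallback, margin=0.2):
    """Use the Bayesian result unless the top two posteriors are within `margin`.

    `fallback` is a hypothetical second-stage classifier (the SVM in the paper);
    `margin` is an illustrative threshold, not a value from the paper.
    """
    probs = posteriors(tokens, *model)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
        return fallback(tokens), "fallback"
    return ranked[0][0], "bayes"


if __name__ == "__main__":
    samples = [
        (["win", "prize", "free"], "spam"),
        (["free", "offer"], "spam"),
        (["meeting", "tomorrow"], "ham"),
        (["lunch", "tomorrow"], "ham"),
    ]
    model = train_naive_bayes(samples)
    print(classify(["free", "prize"], model, fallback=lambda t: "ham"))
    print(classify(["hello"], model, fallback=lambda t: "ham"))
```

In this toy run the first message is decided by the Bayesian stage, while the second contains only unseen words, so the posteriors tie and the decision is handed to the fallback. A real second stage would be a trained SVM rather than a constant function.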