
A spam short message classification method based on word contribution (cited by: 3)
Abstract: A classification method based on word contribution is proposed for spam short messages. The concept of word contribution is introduced to represent how the weight of a word differs across message categories. A word contribution-category matrix is constructed, and the mean square deviation of each row of the matrix is computed to reduce dimensionality. The membership degree of a short message in each category is then calculated from the word contributions. If more than one candidate category remains, the classification conflict is resolved by comparing the densities of the message-category membership degrees. Experimental results show that the proposed method outperforms other commonly used spam short message classification methods in both classification quality and real-time performance.
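The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration, not the authors' definitions: word contribution is taken as the fraction of a category's messages that contain the word, dimensionality reduction keeps words whose row standard deviation reaches a hypothetical `threshold`, membership is the sum of contributions of the message's retained words, and density is membership divided by the number of message words with a nonzero contribution to that category. All function and parameter names are hypothetical.

```python
from collections import Counter
from statistics import pstdev


def contribution_matrix(docs_by_class):
    """Build {word: {category: contribution}}.

    ASSUMPTION: contribution = fraction of the category's messages that
    contain the word. The abstract does not define the actual formula.
    """
    classes = list(docs_by_class)
    doc_freq = {c: Counter(w for doc in docs for w in set(doc))
                for c, docs in docs_by_class.items()}
    vocab = set().union(*doc_freq.values())
    return {w: {c: doc_freq[c][w] / len(docs_by_class[c]) for c in classes}
            for w in vocab}


def reduce_dimensions(matrix, threshold):
    """Drop words whose contributions barely differ across categories.

    ASSUMPTION: a row is kept when its population standard deviation
    is at least `threshold` (a hypothetical tuning parameter).
    """
    return {w: row for w, row in matrix.items()
            if pstdev(list(row.values())) >= threshold}


def classify(words, matrix, tolerance=1e-9):
    """Return the category with the highest membership degree.

    ASSUMPTION: categories whose membership is within `tolerance` of the
    maximum count as conflicting candidates, and the conflict is settled
    by the higher membership density.
    """
    if not matrix:
        raise ValueError("contribution matrix is empty")
    classes = sorted(next(iter(matrix.values())))
    known = [w for w in set(words) if w in matrix]
    membership = {c: sum(matrix[w][c] for w in known) for c in classes}
    best = max(membership.values())
    candidates = [c for c in classes if best - membership[c] <= tolerance]
    if len(candidates) == 1:
        return candidates[0]

    def density(c):
        # Membership per message word that actually supports category c.
        support = sum(1 for w in known if matrix[w][c] > 0)
        return membership[c] / support if support else 0.0

    return max(candidates, key=density)


# Toy, made-up training data (already tokenized); not from the paper.
training = {
    "spam": [["win", "prize", "now"], ["free", "prize", "call"]],
    "ham": [["call", "me", "now"], ["see", "you", "soon"]],
}
reduced = reduce_dimensions(contribution_matrix(training), threshold=0.1)
print(classify(["free", "prize", "today"], reduced))
```

In this toy data, "now" and "call" appear equally often in both categories, so their rows have zero deviation and are removed before classification. That is the intuition behind the row-deviation step: words that do not discriminate between categories add dimensions without adding evidence.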
Source: Journal of Shandong University (Engineering Science), 2012, No. 5, pp. 87-90 (4 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: University Philosophy and Social Science Research Project of the Jiangsu Provincial Department of Education (2012SJD87001)
Keywords: spam short message; text classification; word contribution; variance; dimensionality reduction


