Abstract
With the spread of smartphones, text messaging has made daily life more convenient, but it has also given rise to information security problems such as SMS fraud, harassment, and the spread of illegal content. Traditional SMS filters based on naive Bayes classification perform poorly when the posterior probabilities of the classes are close to each other. This paper proposes a multi-level SMS filtering method. The method first combines a threshold with feature scoring to improve the accuracy of spam classification; it then adds an incremental learning mechanism to reduce the misclassifications caused by the fast-changing and complex nature of text messages. Experimental results show that, compared with naive Bayes classification and with each improvement applied on its own, the multi-level filtering method effectively improves the accuracy of SMS classification.
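The abstract names three ingredients (a threshold, a feature score, and incremental learning) without giving their formulas, so the following is only a minimal sketch of how such a pipeline might be wired together, not the paper's implementation. The class name `MultiLevelFilter`, the `margin` parameter, the voting rule in `feature_score`, and the choice to resolve ties as ham are all assumptions made for illustration.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")


class MultiLevelFilter:
    """Naive Bayes with a margin threshold, a fallback score, and online updates."""

    def __init__(self, margin=1.0, alpha=1.0):
        self.margin = margin  # assumed: minimum log-posterior gap needed to trust Bayes
        self.alpha = alpha    # Laplace smoothing constant
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = {label: 0 for label in LABELS}

    def update(self, tokens, label):
        """Incremental learning: fold one labelled message into the counts."""
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1

    def _log_likelihood(self, token, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        # Vocabulary size plus one slot reserved for unseen tokens.
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"])) + 1
        return math.log((counts[token] + self.alpha) / (total + self.alpha * vocab))

    def log_posteriors(self, tokens):
        """Unnormalised log posterior of each class for a tokenised message."""
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for label in LABELS:
            prior = (self.doc_counts[label] + self.alpha) / (n_docs + 2 * self.alpha)
            scores[label] = math.log(prior) + sum(
                self._log_likelihood(t, label) for t in tokens
            )
        return scores

    def feature_score(self, tokens):
        """Assumed tie-breaker: each distinct token votes +1 (spam) or -1 (ham)."""
        score = 0
        for token in set(tokens):
            rates = {}
            for label in LABELS:
                total = sum(self.word_counts[label].values()) or 1
                rates[label] = self.word_counts[label][token] / total
            if rates["spam"] > rates["ham"]:
                score += 1
            elif rates["spam"] < rates["ham"]:
                score -= 1
        return score

    def classify(self, tokens):
        """Return (label, stage), where stage names the level that decided."""
        post = self.log_posteriors(tokens)
        gap = post["spam"] - post["ham"]
        if abs(gap) >= self.margin:
            return ("spam" if gap > 0 else "ham", "bayes")
        # Posteriors are too close: defer to the feature score; ties go to ham.
        return ("spam" if self.feature_score(tokens) > 0 else "ham", "feature_score")


def build_demo_filter():
    """Train a filter on four toy messages (illustrative data only)."""
    clf = MultiLevelFilter()
    clf.update(["win", "free", "prize"], "spam")
    clf.update(["free", "cash", "now"], "spam")
    clf.update(["meeting", "at", "noon"], "ham")
    clf.update(["see", "you", "at", "lunch"], "ham")
    return clf
```

In this sketch a message such as `["free", "at"]` lands inside the margin and is handed to the feature score, while calling `update` with newly labelled messages shifts the counts so that later calls to `classify` reflect them without retraining from scratch.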
Source
Computer & Digital Engineering (《计算机与数字工程》)
2016, Issue 9, pp. 1752-1756, 1781 (6 pages)
Funding
Supported by the Open Fund of the Yunnan Provincial Key Laboratory of Software Engineering (No. 2010KS01)
Keywords
spam messages
naive Bayes
text classification
feature score
threshold
incremental learning