Abstract
This paper studies an improved naive Bayes text classification algorithm based on the SVM algorithm and its application to spam SMS filtering. Naive Bayes suffers from its conditional independence assumption, heavy dependence on the distribution of the sample space, and inherent instability, which increase the algorithm's time complexity. To address these shortcomings, an improved SVM-based naive Bayes solution for spam SMS filtering is proposed that combines the efficient classification of naive Bayes with the incremental learning of SVM and its independence from the sample space. The method first uses the structural risk minimization principle and a nonlinear transformation to convert the classification problem into a quadratic optimization problem, and then uses the naive Bayes algorithm to filter messages, improving classification accuracy and stability. Simulation results show that the algorithm quickly obtains the optimal classification feature subset and effectively improves both the accuracy and the speed of spam SMS filtering.
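The two-stage idea described in the abstract (an SVM step that yields a feature subset, followed by naive Bayes filtering) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear SVM trained by Pegasos-style subgradient descent in place of the kernel-based quadratic program the abstract mentions, and the toy corpus, function names, and parameters (`epochs`, `lam`, `m`) are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: +1 = spam, -1 = legitimate.
TRAIN = [
    ("win free prize now", 1),
    ("free cash prize claim now", 1),
    ("claim your free reward", 1),
    ("meeting at noon tomorrow", -1),
    ("see you at lunch tomorrow", -1),
    ("call me after the meeting", -1),
]

def tokenize(text):
    return text.lower().split()

def train_linear_svm(docs, labels, vocab, epochs=200, lam=0.01):
    """Pegasos-style subgradient descent on the regularised hinge loss."""
    w = {t: 0.0 for t in vocab}
    step = 0
    for _ in range(epochs):
        for tokens, y in zip(docs, labels):
            step += 1
            eta = 1.0 / (lam * step)          # decaying learning rate
            x = Counter(tokens)               # bag-of-words counts
            margin = y * sum(w[t] * c for t, c in x.items())
            for t in w:                       # shrink (regularisation)
                w[t] *= 1.0 - eta * lam
            if margin < 1.0:                  # hinge loss is active
                for t, c in x.items():
                    w[t] += eta * y * c
    return w

def select_features(w, m):
    """Keep the m most negative and the m most positive SVM weights."""
    ranked = sorted(w, key=w.get)
    return set(ranked[:m]) | set(ranked[-m:])

def train_naive_bayes(docs, labels, features):
    """Multinomial naive Bayes with Laplace smoothing on selected tokens."""
    model = {}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        counts = Counter(t for d in class_docs for t in d if t in features)
        total = sum(counts.values()) + len(features)
        log_prior = math.log(len(class_docs) / len(docs))
        log_lik = {t: math.log((counts[t] + 1) / total) for t in features}
        model[c] = (log_prior, log_lik)
    return model

def classify(model, text):
    """Return the class with the highest log-posterior score."""
    tokens = tokenize(text)
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(log_lik[t] for t in tokens if t in log_lik)
    return max(model, key=score)

docs = [tokenize(text) for text, _ in TRAIN]
labels = [y for _, y in TRAIN]
vocab = sorted({t for d in docs for t in d})

weights = train_linear_svm(docs, labels, vocab)
features = select_features(weights, m=3)
model = train_naive_bayes(docs, labels, features)
```

Calling `classify(model, message)` returns `1` for spam or `-1` for legitimate. In this sketch, tokens outside the selected subset are ignored, so a message containing none of the selected tokens is decided by the class priors alone.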
Source
Computer Measurement & Control (《计算机测量与控制》)
CSCD
PKU Core Journal (北大核心)
2012, Issue 2, pp. 526-528, 551 (4 pages)