Research on Spam Filtering Based on the NB Algorithm in a Cloud Environment

Cited by: 4
Abstract: The naive Bayes (NB) algorithm classifies spam with high accuracy and separates spam from legitimate mail well, but the training phase that precedes classification consumes large amounts of system and network resources and seriously reduces classification efficiency. To address this, the Spark platform is introduced so that mail classification is handled in parallel, and the lineage of Spark RDDs is used to organize the stages of NB mail classification. Experimental results show that, compared with other traditional classification methods, naive Bayes performs well on precision, recall, and related metrics. Compared with traditional single-machine mail classification, the distributed approach on a Spark cluster also greatly speeds up classification.
Authors: LIU Yue-feng (刘月峰); ZHANG Ya-bin (张亚斌); YUAN Jiang-hao (苑江浩) (School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou 014010, China)
Source: Microelectronics & Computer (《微电子学与计算机》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 8, pp. 60-63 (4 pages)
Keywords: spam email; naive Bayes; Spark computing platform; distributed
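The abstract describes splitting NB training across a cluster. The following is a minimal sketch of that idea in plain Python and is not the authors' implementation. It does not use Spark: partitions are ordinary lists, counting per partition stands in for a map step, and merging counts stands in for a reduce step. The function names (`count_partition`, `merge_counts`, `train`, `classify`), the Laplace smoothing, and the toy messages are all assumptions made for illustration.

```python
import math
from collections import Counter
from functools import reduce


def count_partition(partition):
    """Map step: count labels and per-label word occurrences in one partition."""
    label_counts = Counter()
    word_counts = {}
    for label, text in partition:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return label_counts, word_counts


def merge_counts(a, b):
    """Reduce step: add the counts from two partitions without mutating either."""
    labels = a[0] + b[0]
    words = {}
    for part in (a[1], b[1]):
        for label, counter in part.items():
            words.setdefault(label, Counter()).update(counter)
    return labels, words


def train(partitions):
    """Count each partition independently, then merge into one model."""
    return reduce(merge_counts, map(count_partition, partitions))


def classify(model, text):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    label_counts, word_counts = model
    vocab = set()
    for counter in word_counts.values():
        vocab.update(counter)
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two partitions of (label, message) pairs.
partitions = [
    [("spam", "win cash now"), ("ham", "meeting at noon")],
    [("spam", "cheap cash offer"), ("ham", "lunch meeting tomorrow")],
]
model = train(partitions)
print(classify(model, "cash offer now"))
print(classify(model, "lunch meeting"))
```

Because the model is only a set of additive counts, training on several partitions and merging gives the same result as training on all the data at once, which is what makes the training phase straightforward to distribute.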