Abstract
Traditional distributed large-scale mail systems process massive volumes of data with relatively high accuracy, but they suffer from low efficiency, complex procedures, and heavy resource consumption. Optimization on the basis of traditional filtering techniques is therefore needed. A Bayesian filtering algorithm implemented in the MapReduce framework on the Hadoop platform is adopted and the system configuration is optimized, which yields a more comprehensive knowledge base. This not only improves the reliability of the mail filtering stage but also substantially improves the experimental performance evaluation metrics.
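As an illustration of how Bayesian mail filtering can be expressed as map and reduce steps, the following is a minimal sketch and not the implementation evaluated in the paper. It runs in a single Python process rather than on Hadoop, and the function names (map_mail, reduce_counts, train, classify), the "spam"/"ham" labels, whitespace tokenization, and Laplace smoothing are all assumptions made for this example.

```python
import math
from collections import defaultdict


def map_mail(label, text):
    """Map step (hypothetical): emit (key, 1) pairs for one labelled mail."""
    yield (("doc", label), 1)  # one count per training document
    for word in text.lower().split():
        yield (("word", label, word), 1)  # one count per token occurrence


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(mails):
    """Build the count table (the 'knowledge base') from (label, text) pairs."""
    pairs = (pair for label, text in mails for pair in map_mail(label, text))
    return reduce_counts(pairs)


def classify(counts, text, labels=("spam", "ham")):
    """Return the label with the highest Laplace-smoothed log posterior."""
    vocab = {key[2] for key in counts if key[0] == "word"}
    n_docs = sum(counts.get(("doc", label), 0) for label in labels)
    best_label, best_score = None, -math.inf
    for label in labels:
        n_words = sum(
            v for k, v in counts.items() if k[0] == "word" and k[1] == label
        )
        # Smoothed log prior for the class.
        score = math.log(counts.get(("doc", label), 0) + 1)
        score -= math.log(n_docs + len(labels))
        # Smoothed log likelihood of each token given the class.
        for word in text.lower().split():
            c = counts.get(("word", label, word), 0)
            score += math.log((c + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In a real Hadoop job the map and reduce functions would run on separate nodes over partitions of the mail corpus; here they are chained in memory only to show the data flow from labelled mails to the count table used for classification.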
Source
Journal of Shenyang Ligong University (《沈阳理工大学学报》)
CAS
2017, No. 6, pp. 42-46 (5 pages)