Abstract
This paper first introduces the vector space model (VSM) and methods for feature vector extraction, then derives and analyzes the naive Bayesian classification algorithm, which rests on the assumption that features are mutually independent. On this basis, it proposes an improved Bayesian algorithm that assumes only some of the features are mutually independent, which fits practical needs better than the naive Bayesian algorithm, and applies it to spam filtering. Finally, it outlines the basic steps of filtering spam with the Bayesian algorithm.
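For orientation, the standard naive Bayesian classifier that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's improved algorithm: the bag-of-words representation, the add-one (Laplace) smoothing, the `spam`/`ham` labels, and the function names `train`, `log_posterior`, and `classify` are all hypothetical choices made here for the example.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")  # hypothetical class names for this sketch


def train(docs):
    """Count words per class. docs is a list of (tokens, label) pairs.

    Assumes every label in LABELS occurs at least once in docs.
    """
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for tokens, label in docs:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = set()
    for label in LABELS:
        vocab |= set(word_counts[label])
    return word_counts, doc_counts, vocab


def log_posterior(tokens, label, model):
    """Unnormalized log P(label | tokens) under the independence assumption."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    total_words = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / total_docs)  # class prior
    for token in tokens:
        if token not in vocab:
            continue  # ignore words never seen in training
        # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
        score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
    return score


def classify(tokens, model):
    """Return the label with the highest posterior score."""
    return max(LABELS, key=lambda label: log_posterior(tokens, label, model))


# Toy training data, invented purely for illustration.
training = [
    (["free", "prize", "click"], "spam"),
    (["free", "money", "now"], "spam"),
    (["meeting", "schedule", "monday"], "ham"),
    (["project", "report", "monday"], "ham"),
]
model = train(training)
```

Summing log-probabilities per token is exactly where the "features are mutually independent" assumption enters; an approach that assumes only partial independence, as the abstract describes, would replace that per-token sum with terms over groups of dependent features.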
Author
白东燕
BAI Dong-yan (Department of Power Electronics and Electric Drives, Shijiazhuang Railway Institute, Shijiazhuang 050043, China)
Source
《电脑知识与技术》
2007, No. 4, pp. 154-155 (2 pages)
Computer Knowledge and Technology