Abstract
This paper analyzes how e-mail features affect mail classification and proposes a bi-layer classification method for an intelligent agent on a mail server. The agent comprises modules for mail length classification, mail collection and preprocessing, Chinese word segmentation, feature selection, and mail classification. Besides giving a mail server the ability to filter spam automatically, the agent can be used in e-government and e-business to classify and forward mail automatically. The bi-layer method first classifies mails by length and then applies a separate Bayes classifier to each length class to filter spam. Experiments indicate that the method achieves high classification precision and effectively improves classification efficiency.
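The two layers described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the length classes, so the two buckets (`short`, `long`) and the `length_threshold` parameter are hypothetical. Whitespace tokenization stands in for Chinese word segmentation, feature selection is omitted, and the Bayes classifier is assumed to be a multinomial naive Bayes with add-one smoothing. All class and method names are illustrative.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter()              # label -> number of training mails
        self.vocab = set()

    def train(self, tokens, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        """Return the label with the highest log-posterior, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class BiLayerClassifier:
    """Layer 1 buckets a mail by length; layer 2 is one Bayes model per bucket."""

    def __init__(self, length_threshold=20):
        # Hypothetical cut-off in tokens; the abstract gives no length classes.
        self.length_threshold = length_threshold
        self.models = {"short": NaiveBayes(), "long": NaiveBayes()}

    @staticmethod
    def _tokenize(text):
        # Assumption: whitespace split replaces Chinese word segmentation.
        return text.lower().split()

    def _bucket(self, tokens):
        return "short" if len(tokens) < self.length_threshold else "long"

    def train(self, text, label):
        tokens = self._tokenize(text)
        self.models[self._bucket(tokens)].train(tokens, label)

    def predict(self, text):
        tokens = self._tokenize(text)
        return self.models[self._bucket(tokens)].predict(tokens)


# Toy usage with made-up training mails.
clf = BiLayerClassifier(length_threshold=5)
clf.train("win cash now", "spam")
clf.train("lunch at noon", "ham")
clf.train("claim your free prize today by clicking this link", "spam")
clf.train("the project meeting has moved to friday morning in room two", "ham")
print(clf.predict("win cash"))
print(clf.predict("the meeting has moved to friday"))
```

Keeping one model per length bucket means short and long mails never share word statistics, which is the point of the first layer. A bucket that has seen no training mail returns `None` in this sketch.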
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
PKU Core Journal (北大核心)
2007, Issue 3, pp. 683-686 (4 pages)
Keywords
mail classification
Bayes classifier
feature selection
Chinese word segmentation
bi-layer classification