Abstract
By comparing how spam has changed over time and analyzing the aims of spam senders, this paper summarizes the trends in the variation of Chinese spam features. To address several important variation features of Chinese spam, and taking the characteristics of the Chinese language into account, a Pinyin conversion module, a traditional/simplified Chinese conversion module, and a regular-expression matching module are added to an anti-spam system. Experimental results show that the proposed method achieves a high recognition rate for the variation features of spam.
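The three modules named in the abstract can be pictured as a normalization step placed in front of keyword matching. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the lookup tables `TRAD_TO_SIMP` and `CHAR_TO_PINYIN`, the keyword pattern, and all function names are hypothetical, and the conversion direction (traditional to simplified) is assumed for illustration.

```python
import re

# Hypothetical, tiny lookup tables for illustration only. A real system
# would need full traditional-to-simplified and character-to-Pinyin dictionaries.
TRAD_TO_SIMP = {"發": "发", "優": "优"}
CHAR_TO_PINYIN = {"发": "fa", "票": "piao", "优": "you", "惠": "hui"}

# Assumption: anything that is not a CJK ideograph, ASCII letter, or digit
# is an inserted separator and can be dropped before matching.
NOISE = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9]+")

# Hypothetical spam keywords, written as toneless Pinyin keys.
SPAM_PATTERN = re.compile(r"fapiao|youhui")


def to_simplified(text):
    """Map traditional characters to simplified ones; leave others unchanged."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def to_pinyin(text):
    """Map known characters to Pinyin; lowercase everything else."""
    return "".join(CHAR_TO_PINYIN.get(ch, ch.lower()) for ch in text)


def normalize(text):
    """Strip separators, unify the script, then reduce to a Pinyin key."""
    stripped = NOISE.sub("", text)
    return to_pinyin(to_simplified(stripped))


def has_spam_feature(text):
    """Return True if the normalized text contains a known spam keyword."""
    return SPAM_PATTERN.search(normalize(text)) is not None
```

Under these assumptions, `發*票`, `fa 票`, and `发票` all reduce to the same key, `fapiao`, so a single pattern covers the three variants. Concatenating Pinyin without word boundaries can also produce false matches across adjacent words, which a real system would have to handle.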
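Funding
Guangdong Provincial-Ministerial Industry-University-Research Cooperation Project (No. 2011A090200072)
Guangdong Provincial College Students' Innovation Training Program (No. 1056412154)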
Keywords
Chinese spam
Variation features
Feature extraction