A Spam Filtering Model Based on Embedded Feature Selection Cited by: 13

Lazy Learning Spam Filtering Model Based on Embedded Feature Selection
Abstract The characteristics of spam make lazy-learning text categorization algorithms better suited to spam filtering than eager-learning ones. However, lazy approaches, typified by k-NN, suffer from low runtime efficiency, among other drawbacks, which makes them inconvenient in practice. They also generally need a feature selection step to reduce the dimensionality of the feature space, and the information lost in that step degrades overall performance. Building on the vector cosine similarity formula, this paper therefore proposes a new "embedded feature selection" spam filtering model and a lazy-learning spam filtering algorithm based on it. Compared with several classical algorithms, the new algorithm significantly reduces computational cost while avoiding the information loss that such a reduction would otherwise cause, so it improves both classification performance and efficiency and has high practical value.
Source Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2009, No. 8, pp. 1616-1620 (5 pages)
Funding Supported by the National High Technology Research and Development Program of China (863 Program), grant 2006AA01Z455
Keywords spam filtering; machine learning; cosine similarity; embedded feature selection
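
For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of a lazy-learning (k-NN) spam classifier that uses cosine similarity over bag-of-words vectors. It is not the paper's embedded feature selection model, whose details the abstract does not give. The function names, the whitespace tokenizer, the toy training set, and the default `k=3` are all assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector as a sparse mapping (illustrative tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(message, training, k=3):
    """Label a message by majority vote among its k most similar training mails."""
    query = vectorize(message)
    scored = sorted(
        ((cosine_similarity(query, vectorize(text)), label) for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, not data from the paper.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap pills free offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
    ("lunch on monday", "ham"),
]

if __name__ == "__main__":
    print(knn_classify("claim your free money", TRAINING))
    print(knn_classify("monday project meeting", TRAINING))
```

Each query compares the message against every stored mail over the full vocabulary, with no training phase to amortize the work. This is the efficiency cost of lazy learning that the abstract refers to, and the reason such classifiers are usually paired with a separate dimensionality-reduction step.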