Abstract
To meet the need for content-based spam filtering, a vector space model (VSM) is used to extract a model vector from a sample document set composed of real spam. When a new e-mail arrives, its vector is extracted using the model vector and an associated term-weighting method, and the cosine function is then used to compute the similarity between the two vectors, which allows spam to be filtered.
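The pipeline described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the weighting method, so smoothed TF-IDF is assumed here; the model vector is assumed to be the centroid of the spam sample vectors; and the whitespace tokenizer, the function names (`build_model`, `weight`, `cosine`, `is_spam`) and the 0.3 threshold are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real mail would need
    # proper word segmentation, especially for Chinese text.
    return text.lower().split()


def weight(tokens, idf):
    # Assumed weighting: term frequency times inverse document frequency.
    # Terms absent from the spam model are dropped.
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def build_model(spam_docs):
    # Extract the model from the sample spam set: IDF weights plus a
    # centroid vector (the centroid is an assumption of this sketch).
    n = len(spam_docs)
    tokenized = [tokenize(d) for d in spam_docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    centroid = Counter()
    for toks in tokenized:
        for t, w in weight(toks, idf).items():
            centroid[t] += w / n
    return idf, dict(centroid)


def cosine(a, b):
    # Cosine similarity of two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def is_spam(text, idf, centroid, threshold=0.3):
    # The threshold is hypothetical; it would need tuning on real data.
    return cosine(weight(tokenize(text), idf), centroid) >= threshold


spam_samples = [
    "win free money now",
    "free prize claim now",
    "cheap money free offer",
]
idf, centroid = build_model(spam_samples)
print(is_spam("claim your free money now", idf, centroid))
print(is_spam("meeting agenda for monday", idf, centroid))
```

A mail that shares no terms with the spam model gets an empty vector and a similarity of zero, so it is never flagged; in practice the threshold would be chosen by evaluating against labeled mail.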
Source
《仪器仪表用户》
2007, No. 1, pp. 97-98 (2 pages)
Instrumentation
Keywords
content-based filtering
Vector Space Model
similarity