
Chinese Short Message Service Spam Filtering Based on Logistic Regression

Cited by: 2
Abstract: We designed and implemented a Chinese SMS spam filter that copes well with continually evolving spam messages. The filter is built on a logistic regression model, extracts message features with byte-level n-grams, and is trained with the TONE (Train On or Near Error) method. Experiments show that a spam filter built with this approach performs well.
Source: Journal of Heilongjiang Institute of Technology (《黑龙江工程学院学报》), CAS, 2010, No. 4, pp. 36-39 (4 pages)
Funding: Project funded by the Heilongjiang Provincial Department of Education (11551401)
Keywords: Chinese SMS spam filtering; logistic regression model; n-gram; TONE
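
The three ingredients named in the abstract (byte-level n-gram features, a logistic regression model, and TONE training) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The class name `ToneLogisticFilter`, the helper `byte_ngrams`, the n-gram length of 4, the learning rate, the margin, and the sample messages are all assumptions chosen for the example, and UTF-8 is assumed as the byte encoding. TONE is taken here in its usual sense: update the model only when a message is misclassified or its score falls close to the decision threshold.

```python
import math
from collections import defaultdict


def byte_ngrams(text, n=4):
    """Return the set of overlapping byte-level n-grams of a message (UTF-8 assumed)."""
    data = text.encode("utf-8")
    if len(data) < n:
        return {data}
    return {data[i:i + n] for i in range(len(data) - n + 1)}


class ToneLogisticFilter:
    """Online logistic regression over binary n-gram features (illustrative sketch)."""

    def __init__(self, n=4, rate=0.5, margin=0.25):
        self.n = n
        self.rate = rate          # learning rate (assumed value)
        self.margin = margin      # half-width of the "near error" band (assumed value)
        self.weights = defaultdict(float)

    def score(self, text):
        """Estimated probability that the message is spam."""
        z = sum(self.weights.get(g, 0.0) for g in byte_ngrams(text, self.n))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def train(self, text, is_spam):
        """TONE step: update only on an error or a near error; return True if updated."""
        p = self.score(text)
        wrong = (p > 0.5) != is_spam
        near = abs(p - 0.5) < self.margin
        if not (wrong or near):
            return False
        label = 1.0 if is_spam else 0.0
        # Gradient step for logistic loss with binary features.
        for g in byte_ngrams(text, self.n):
            self.weights[g] += self.rate * (label - p)
        return True


# Hypothetical training messages: (text, is_spam).
samples = [
    ("恭喜您中奖,请点击链接领取奖金", True),
    ("今晚一起吃饭吗", False),
    ("低价发票代开,欢迎来电", True),
    ("明天上午九点开会", False),
]

f = ToneLogisticFilter()
for _ in range(20):
    for text, is_spam in samples:
        f.train(text, is_spam)
```

Working on bytes rather than segmented words avoids the need for a Chinese word segmenter, and the TONE condition skips updates for messages the filter already classifies confidently, which keeps online training cheap.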


