
Chat Content Filtering Based on Doc2Vec and SVM

Cited by: 5
Abstract: Real-time interception of user chat content in live-streaming systems is of great practical importance. To improve classification accuracy and efficiency, a text classification model that combines Doc2Vec and SVM is proposed to classify chat content and decide whether a message should be intercepted. The model first uses Doc2Vec to represent each chat message as a dense numeric vector, and then classifies these vectors with an SVM classifier. Experiments show that the model reduces the dimensionality of the text representation and improves training efficiency, and that it achieves 97% accuracy and 89.82% recall, outperforming both Naive Bayes and a Doc2Vec-based logistic model.
Author: YUE Wen-Ying (岳文应), School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Source: Computer Systems & Applications (《计算机系统应用》), 2018, No. 7, pp. 127-132 (6 pages)
Keywords: text classification; Natural Language Processing (NLP); Doc2Vec model; Support Vector Machine (SVM)
