Abstract
The widespread dissemination of false information during sudden, large-scale public events can be extremely destructive, and during an epidemic it can seriously interfere with treatment efforts. To address the sparse features and low accuracy of traditional classification models, this paper proposes a Word2Vec-based method for detecting false information about the epidemic. The method trains word vectors with a Word2Vec model, which resolves the feature sparsity of the traditional vector space model, then weights the word vectors with TF-IDF, and finally feeds the processed data into an SVM model. Experiments on a dataset crawled from domestic news platforms show that the method improves the accuracy of false-information detection by more than 4% over traditional methods.
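The weighting step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the weighted word vectors are combined into a document representation, so the weighted average below is an assumption, as are the toy two-dimensional embeddings (`TOY_VECTORS`), the smoothed IDF formula, and all function names. In a real pipeline the embeddings would come from a trained Word2Vec model (for example gensim's `Word2Vec`), and the resulting document vectors would be passed to an SVM classifier (for example scikit-learn's `SVC`).

```python
import math
from collections import Counter

# Hypothetical toy embeddings standing in for vectors from a trained Word2Vec model.
TOY_VECTORS = {
    "vaccine": [1.0, 0.0],
    "cures": [0.0, 1.0],
    "garlic": [0.5, 0.5],
    "hospital": [1.0, 1.0],
}


def inverse_document_frequency(corpus):
    """Return a smoothed IDF per term: log((1 + N) / (1 + df)) + 1 (one common variant)."""
    n_docs = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def weighted_document_vector(tokens, vectors, idf):
    """Average the word vectors of known tokens, weighting each by its TF-IDF score."""
    dim = len(next(iter(vectors.values())))
    counts = Counter(t for t in tokens if t in vectors and t in idf)
    total = [0.0] * dim
    weight_sum = 0.0
    for term, count in counts.items():
        weight = (count / len(tokens)) * idf[term]  # term frequency times IDF
        for i, x in enumerate(vectors[term]):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known tokens: fall back to the zero vector
    return [x / weight_sum for x in total]


# Tokenised toy documents; real input would be segmented news text.
corpus = [
    ["garlic", "cures", "vaccine"],
    ["hospital", "vaccine"],
    ["hospital", "vaccine", "vaccine"],
]
idf = inverse_document_frequency(corpus)
features = [weighted_document_vector(doc, TOY_VECTORS, idf) for doc in corpus]
```

Each entry of `features` is a dense, fixed-length vector regardless of vocabulary size, which is the property the abstract contrasts with the sparse vector space model. Rare terms such as "garlic" receive a larger weight than terms that appear in every document.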
Authors
齐浩翔
马莉媛
朱翌民
QI Haoxiang; MA Liyuan; ZHU Yimin (School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China)
Source
《智能计算机与应用》
2021, No. 10, pp. 134-138 (5 pages)
Intelligent Computer and Applications