Abstract
Constructing a good text vector representation is important for text sentiment classification. Because words in a text differ in their ability to discriminate between categories, this paper proposes ITIW, a text vector representation that weights Word2Vec vectors with an improved TF-IDF, so that words with different category discrimination ability receive different weights. The word vectors constructed with this method serve as the input to a convolutional neural network (CNN), giving the ITIW-CNN text sentiment classification model. The model uses the improved TF-IDF to distinguish how well different words represent categories, computes a weight for each word, and from it obtains the weighted word vector representation (ITIW). The weighted word vectors are then fed into the CNN for sentiment classification, with the aim of improving the model's classification ability. Experimental results show that, compared with classification algorithms that use traditional text representations, the ITIW-CNN model improves to some degree on every evaluation metric, with the F1 score and classification accuracy reaching 94.77% and 92.80%, respectively. The ITIW-CNN model can effectively improve text sentiment classification.
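The weighting step described in the abstract can be sketched roughly as follows. The abstract does not give the formula of the improved TF-IDF, so this sketch substitutes a standard smoothed TF-IDF as a stand-in; it is not the authors' ITIW weighting. The function names, the toy corpus, and the 2-dimensional embeddings (placeholders for real Word2Vec output) are all hypothetical.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed inverse document frequency for every word in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def weighted_word_vectors(doc, idf, embeddings):
    """Scale each token's embedding by its TF-IDF weight within `doc`.

    Returns one weighted vector per token, i.e. a len(doc) x dim matrix
    of the kind that could be passed to a CNN as its input.
    """
    counts = Counter(doc)
    rows = []
    for word in doc:
        tf = counts[word] / len(doc)   # term frequency inside this document
        weight = tf * idf[word]        # standard TF-IDF, NOT the paper's improved variant
        rows.append([weight * x for x in embeddings[word]])
    return rows


# Toy tokenized corpus and hypothetical embeddings (assumptions for illustration only).
docs = [["good", "movie"], ["bad", "movie"], ["good", "plot", "good"]]
embeddings = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "movie": [0.0, 1.0],
    "plot": [0.0, 0.5],
}

idf = idf_table(docs)
matrix = weighted_word_vectors(docs[1], idf, embeddings)
```

In this toy corpus "bad" appears in fewer documents than "movie", so its vector is scaled by a larger weight. A category-aware weighting such as the one the paper proposes would replace the `weight` line.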
Authors
LI Changbing (李昌兵); ZHAO Ling (赵玲); LI Xiaoguang (李晓光); WANG Li (王利)
Key Laboratory of Electronic Commerce and Modern Logistics, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Source
Journal of Chongqing University of Technology: Natural Science (《重庆理工大学学报(自然科学)》)
Indexed in: CAS; Peking University Core Journals (北大核心)
2021, No. 11, pp. 109-115 (7 pages)
Funding
National Natural Science Foundation of China (60905066/F030707, 71901045)
Humanities and Social Sciences Planning Fund of the Ministry of Education (20YJAZH102)