Abstract
To address the low classification accuracy caused by long texts and unevenly distributed sentiment semantics, this paper proposes pos-ACNN-CNN, a long-text sentiment classification model based on a hierarchical convolutional neural network (CNN). Positional encoding is added to the embedding layer to capture word-order information, and an attention-based CNN identifies how much each word contributes to the sentiment semantics, yielding feature information for sentence pairs, each consisting of two consecutive sentences. A CNN then extracts global features across all sentence pairs to produce the final classification result. Multiple groups of comparative experiments on the IMDB movie review dataset show that the model achieves better classification performance.
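The data flow described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions, because the abstract does not specify them: the sinusoidal form of the positional encoding, the non-overlapping grouping of sentences into pairs, and the softmax-weighted pooling used here as a stand-in for the attention-based CNN. All function names are hypothetical.

```python
import math


def sinusoidal_position(pos, dim):
    """Encoding for one position (sinusoidal scheme is an assumption)."""
    return [
        math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def add_positions(embeddings):
    """Add a position vector to every word embedding in a sequence."""
    dim = len(embeddings[0])
    return [
        [e + p for e, p in zip(vec, sinusoidal_position(pos, dim))]
        for pos, vec in enumerate(embeddings)
    ]


def sentence_pairs(sentences):
    """Group sentences two at a time (non-overlapping pairs are an assumption)."""
    return [sentences[i:i + 2] for i in range(0, len(sentences), 2)]


def attention_pool(vectors, scores):
    """Softmax the scores, then return the weighted sum of the vectors.

    A stand-in for weighting words by sentiment contribution; the paper's
    attention-based CNN layer is not reproduced here.
    """
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
```

In a full model, each sentence pair would pass through the word-level stage to give one feature vector per pair, and a second convolutional stage would then operate over the sequence of pair vectors to produce the document-level prediction.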
Authors
徐逸舟
林晓
陆黎明
XU Yi-zhou; LIN Xiao; LU Li-ming (College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China)
Source
《计算机工程与设计》
Peking University Core Journal (北大核心)
2022, Issue 4, pp. 1121-1126 (6 pages)
Computer Engineering and Design
Funding
National Natural Science Foundation of China, General Program (61872242).