Abstract
Methods based on Convolutional Neural Networks (CNNs) are widely used in sentiment classification. These methods take word vectors as the network input. However, during convolution each word vector represents only a single word and carries no context information, which reduces the continuity of information transmission. To address this problem, a CNN model based on word adjacent features is proposed, in which each word vector carries the features of its adjacent words during convolution. This preserves both the continuity of information transmission and the sequentiality of word vectors within a local range. Experimental results show that the proposed model achieves accuracies of 89.43% and 85.61% on the COAE2014 (two-class) and COAE2015 (three-class) sentiment classification tasks, respectively, which verifies its feasibility and efficiency.
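As a rough illustration of what it could mean for each word vector to carry the features of its adjacent words, the following minimal sketch concatenates every word vector with its left and right neighbours before any convolution is applied. The function name `add_adjacent_features`, the `window` parameter, the zero padding at sentence boundaries, and the concatenation scheme itself are assumptions made for illustration; the abstract does not specify how the model actually combines adjacent features.

```python
from typing import List

Vector = List[float]


def add_adjacent_features(sentence: List[Vector], window: int = 1) -> List[Vector]:
    """Concatenate each word vector with its `window` left and right neighbours.

    Positions beyond the sentence boundary are filled with zero vectors.
    This concatenation scheme is an illustrative assumption, not the
    construction used in the paper.
    """
    if not sentence:
        return []
    dim = len(sentence[0])
    zero = [0.0] * dim
    augmented: List[Vector] = []
    for i in range(len(sentence)):
        row: Vector = []
        # Walk from the leftmost neighbour to the rightmost one, so the
        # local word order is preserved inside the augmented vector.
        for offset in range(-window, window + 1):
            j = i + offset
            row.extend(sentence[j] if 0 <= j < len(sentence) else zero)
        augmented.append(row)
    return augmented


# Toy 2-dimensional "embeddings" for a three-word sentence (made-up values).
toy_sentence = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
augmented = add_adjacent_features(toy_sentence, window=1)
```

With `window=1`, each augmented vector is three times as long as the original, and a convolution filter applied at one position then also sees the neighbouring words. For the toy input, the middle word becomes `[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]`.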
Authors
吕超
杨超
李仁发
LU Chao; YANG Chao; LI Renfa (School of Information Science and Engineering, Hunan University, Changsha 410082, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journal (北大核心)
2018, No. 5, pp. 182-187 (6 pages)
Computer Engineering
Funding
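National Natural Science Foundation of China (61173036)
National High Technology Research and Development Program of China (2012AA01A301-01)
Hunan Provincial Science and Technology Plan Project (2015GK3015)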
Keywords
sentiment classification
Convolutional Neural Network (CNN)
word vector
continuity
sequentiality
adjacent feature