Abstract
Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are widely used in natural language processing. However, natural language has sequential dependencies in its structure, so text classification that relies on a CNN alone ignores the contextual meaning of words, while traditional RNNs suffer from vanishing or exploding gradients; both limit classification accuracy. To address this, a feature fusion model combining a CNN with bidirectional long short-term memory (BiLSTM) is proposed. The CNN extracts local features from the text vectors, and the BiLSTM extracts global features related to the text context. Fusing the features extracted by these two complementary models addresses the single-CNN model's neglect of the contextual semantic and grammatical information of words, and also effectively avoids the vanishing-gradient problem of traditional RNNs. Comparative experiments on two datasets show that the proposed feature fusion model effectively improves text classification accuracy.
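The fusion idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state the fusion operator, the filter sizes, the hidden size, or the classifier, so concatenation, max-over-time pooling, the omission of bias terms, and every name and dimension below (cnn_features, bilstm_features, fused_features, dim, hidden, and so on) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rand_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def cnn_features(embeddings, filters, width):
    """Local features: 1-D convolution over word windows, ReLU, max-over-time pooling."""
    feats = []
    for f in filters:  # each filter is a flat vector of length width * dim
        best = 0.0     # starting at 0 applies the ReLU lower bound
        for i in range(len(embeddings) - width + 1):
            window = [x for vec in embeddings[i:i + width] for x in vec]
            best = max(best, sum(w * x for w, x in zip(f, window)))
        feats.append(best)
    return feats

def lstm_last_state(embeddings, weights, hidden):
    """Run one LSTM direction (biases omitted) and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in embeddings:
        z = matvec(weights, x + h)  # stacked gates: input, forget, output, candidate
        i_gate = [sigmoid(v) for v in z[0:hidden]]
        f_gate = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o_gate = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g_cand = [math.tanh(v) for v in z[3 * hidden:4 * hidden]]
        c = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, g_cand)]
        h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
    return h

def bilstm_features(embeddings, w_fwd, w_bwd, hidden):
    """Global features: final states of a forward pass and a backward pass."""
    forward = lstm_last_state(embeddings, w_fwd, hidden)
    backward = lstm_last_state(embeddings[::-1], w_bwd, hidden)
    return forward + backward

def fused_features(embeddings, filters, width, w_fwd, w_bwd, hidden):
    """Assumed fusion: concatenate the CNN and BiLSTM feature vectors."""
    local = cnn_features(embeddings, filters, width)
    context = bilstm_features(embeddings, w_fwd, w_bwd, hidden)
    return local + context

rng = random.Random(0)
dim, hidden, n_filters, width = 8, 4, 6, 3
sentence = rand_matrix(10, dim, rng)  # stand-in for 10 word vectors
filters = rand_matrix(n_filters, width * dim, rng)
w_fwd = rand_matrix(4 * hidden, dim + hidden, rng)
w_bwd = rand_matrix(4 * hidden, dim + hidden, rng)

features = fused_features(sentence, filters, width, w_fwd, w_bwd, hidden)
print(len(features))  # n_filters + 2 * hidden = 14
```

In a full classifier, the fused vector would typically be passed to a dense layer with softmax; that stage is left out here because the abstract does not describe it.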
Authors
LI Yang (李洋)
DONG Hongbin (董红斌)
College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China
Source
Journal of Computer Applications (《计算机应用》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2018, No. 11, pp. 3075-3080 (6 pages)
Funding
National Natural Science Foundation of China (61472095)
Keywords
word vector
Convolutional Neural Network (CNN)
Bidirectional Long Short-Term Memory (BiLSTM)
feature fusion
text sentiment analysis