Abstract
To address the weakness of convolutional neural networks (CNNs) in capturing contextual dependencies in text, and the feature loss that occurs when deep neural networks extract text features, this paper proposes MLCNN (Merge-LSTM-CNN), a text classification model that combines a long short-term memory (LSTM) network with a CNN. First, the input text is represented as vectors through word embedding, and a three-layer CNN extracts local features of the text, which are then integrated into full-text semantics; in parallel, an LSTM stores features of the historical information in the text to capture context-related semantics. Second, the input vectors are merged with the output of each CNN layer so that the original features are reused. Experimental results show that MLCNN reaches a classification accuracy of 96.45%, outperforming CNN, LSTM, and their improved variants.
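The data flow described in the abstract can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the merge is performed, so concatenation along the feature axis is assumed; combining the two branches by concatenating max-pooled CNN features with the last LSTM hidden state is also assumed. The function names, filter count, kernel size, and hidden size are hypothetical, and the weights are random and untrained.

```python
import math
import random

def conv1d_same(seq, weights, bias):
    """1D convolution over time with zero padding (same length) and ReLU.
    seq: T vectors of size d_in; weights: [n_filters][kernel][d_in]."""
    k, d_in = len(weights[0]), len(seq[0])
    pad = [[0.0] * d_in] * (k // 2)
    padded = pad + seq + pad
    out = []
    for t in range(len(seq)):
        window = padded[t:t + k]
        out.append([max(0.0, bias[f] + sum(w[i][j] * window[i][j]
                                           for i in range(k) for j in range(d_in)))
                    for f, w in enumerate(weights)])
    return out

def merge(a, b):
    """Assumed merge: concatenate two equal-length sequences feature-wise."""
    return [x + y for x, y in zip(a, b)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_last_hidden(seq, W, b, hidden):
    """Standard LSTM cell; W: [4*hidden][d_in+hidden], gate order i, f, o, g."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        z = x + h  # current input joined with previous hidden state
        pre = [b[r] + sum(w * v for w, v in zip(W[r], z)) for r in range(4 * hidden)]
        i = [sigmoid(v) for v in pre[:hidden]]
        f = [sigmoid(v) for v in pre[hidden:2 * hidden]]
        o = [sigmoid(v) for v in pre[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in pre[3 * hidden:]]
        c = [f[j] * c[j] + i[j] * g[j] for j in range(hidden)]
        h = [o[j] * math.tanh(c[j]) for j in range(hidden)]
    return h

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def mlcnn_features(embedded, n_filters=4, kernel=3, hidden=5, seed=0):
    """Feature vector for one embedded text (T x d), random untrained weights."""
    rng = random.Random(seed)
    d = len(embedded[0])
    layer_in = embedded
    for _ in range(3):  # three CNN layers, as in the abstract
        w = [rand_matrix(kernel, len(layer_in[0]), rng) for _ in range(n_filters)]
        conv_out = conv1d_same(layer_in, w, [0.0] * n_filters)
        layer_in = merge(embedded, conv_out)  # reuse the original input features
    pooled = [max(col) for col in zip(*layer_in)]  # global max pooling over time
    W = rand_matrix(4 * hidden, d + hidden, rng)
    h = lstm_last_hidden(embedded, W, [0.0] * (4 * hidden), hidden)
    return pooled + h  # assumed fusion of the CNN and LSTM branches
```

In a full classifier, the returned vector would feed a softmax layer and all weights would be trained; both are omitted here to keep the sketch short.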
Authors
WANG Hai-tao (王海涛)
SONG Wen (宋文)
WANG Hui (王辉)
College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2020, Issue 6, pp. 1163-1168 (6 pages)
Funding
Supported by the National Natural Science Foundation of China (61503124, 61572379).
Keywords
text classification
long short-term memory
convolutional neural network
word embedding
merge