DC-BiGRU_CNN Model for Short-text Classification (cited by: 15)
Abstract: Text classification is a basic task in natural language processing, and deep learning techniques are now widely used for it. When processing text sequences, convolutional neural networks can extract local features and recurrent neural networks can extract global features, and both perform well. However, convolutional neural networks cannot capture the context-dependent semantic information of text very well, while recurrent neural networks are not sensitive to key semantic information. In addition, although deeper networks can extract better features, they are prone to vanishing or exploding gradients. To address these problems, this paper proposes a hybrid model based on densely connected bidirectional gated recurrent unit convolutional networks (DC-BiGRU_CNN). First, a standard convolutional neural network is used to train character-level word vectors, which are then concatenated with word-level word vectors to form the network's input layer. Inspired by densely connected convolutional networks, the proposed densely connected bidirectional gated recurrent unit is used in the high-level semantic modeling stage; it alleviates vanishing and exploding gradients and strengthens the transfer of features between layers, achieving feature reuse. Convolution and pooling operations are then applied to the resulting deep high-level semantic representation to obtain the final semantic feature representation, which is fed to a softmax layer to classify the text. Experimental results on several public datasets show that DC-BiGRU_CNN achieves a significant accuracy improvement on text classification tasks. In addition, the paper analyzes the contribution of the model's different components to the performance gain, and studies the effect of parameters such as the maximum sentence length, the number of network layers, and the convolution kernel size on model performance.
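The input-layer construction described above (a character-level vector concatenated with a word-level vector) can be sketched in plain Python. This is a toy illustration, not the paper's implementation: the lookup tables are random, the dimensions `CHAR_DIM` and `WORD_DIM` are invented, and max-over-characters pooling stands in for the trained character-level CNN.

```python
import random

random.seed(0)

CHAR_DIM, WORD_DIM = 4, 6  # toy sizes; the paper does not specify these here

# Hypothetical lookup tables; real embeddings would be learned during training.
char_emb = {c: [random.uniform(-1, 1) for _ in range(CHAR_DIM)]
            for c in "abcdefghijklmnopqrstuvwxyz"}
word_emb = {w: [random.uniform(-1, 1) for _ in range(WORD_DIM)]
            for w in ["short", "text"]}

def char_level_vector(word):
    """Max-over-characters pooling: a crude stand-in for the char-level CNN."""
    vecs = [char_emb[c] for c in word]
    return [max(v[i] for v in vecs) for i in range(CHAR_DIM)]

def input_vector(word):
    # Concatenate word-level and character-level vectors, as in the input layer.
    return word_emb[word] + char_level_vector(word)

x = input_vector("text")
print(len(x))  # CHAR_DIM + WORD_DIM = 10
```

The concatenation lets the model fall back on sub-word (character) evidence for rare or misspelled words while still using word-level semantics when available.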
Authors: ZHENG Cheng; XUE Man-yi; HONG Tong-tong; SONG Fei-bao (School of Computer Science and Technology, Anhui University, Hefei 230601, China; Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Hefei 230601, China)
Source: Computer Science (《计算机科学》), CSCD / Peking University Core, 2019, No. 11, pp. 186-192 (7 pages)
Keywords: character-level word vector; bi-directional gated recurrent unit; dense connection; convolutional neural network; text classification
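The dense connection pattern named in the keywords (borrowed from DenseNet and applied to stacked BiGRU layers) means each layer receives the concatenation of the outputs of all earlier layers, so features are reused and gradients have short paths. The sketch below shows only that wiring, with an invented averaging function standing in for a real BiGRU layer and toy sizes throughout.

```python
HIDDEN = 8  # per-layer output width (for a real BiGRU: 2 * hidden units); toy value

def toy_bigru_layer(x, out_dim=HIDDEN):
    """Stand-in for a BiGRU layer: any map from len(x) inputs to out_dim outputs."""
    return [sum(x) / len(x)] * out_dim

def dense_stack(x0, num_layers=3):
    features = [x0]                                    # outputs available for reuse
    for _ in range(num_layers):
        layer_input = [v for f in features for v in f]  # concat of ALL earlier outputs
        features.append(toy_bigru_layer(layer_input))
    return [v for f in features for v in f]             # final dense representation

x0 = [0.1] * 10                 # e.g. the concatenated word+char input vector
out = dense_stack(x0)
print(len(out))  # 10 + 3 * HIDDEN = 34
```

Note how the representation width grows additively (10, then +8 per layer): this is the feature-reuse property the abstract credits with easing vanishing/exploding gradients.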
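The final stage the abstract describes, convolution plus pooling over the dense representation followed by a softmax classifier, can also be sketched minimally. Everything here is a toy: one-dimensional single-filter convolutions, max-over-time pooling, and a softmax applied directly to the pooled values in place of a trained fully connected layer.

```python
import math

def conv1d(seq, kernel):
    """Valid 1-D convolution of one filter (no bias) over a feature sequence."""
    k = len(kernel)
    return [sum(kernel[j] * seq[i + j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def softmax(z):
    m = max(z)                       # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

seq = [0.2, 0.5, -0.1, 0.7, 0.3]                  # toy per-position features
filters = [[1.0, -1.0], [0.5, 0.5]]               # two toy kernels of size 2
pooled = [max(conv1d(seq, f)) for f in filters]   # max-over-time pooling per filter
probs = softmax(pooled)                           # toy "classifier" over pooled features
print(round(sum(probs), 6))  # 1.0
```

Max-over-time pooling keeps only each filter's strongest response, which is what makes the convolutional stage sensitive to key local n-gram features regardless of where they occur in the sentence.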

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部