Abstract
The authors present a hierarchical neural network, H-RNN-CNN, for sentiment classification of Chinese text. First, because information may be lost when a text is too long, the text is divided into sentences, and a sentence layer is introduced as an intermediate layer. Second, recurrent neural networks model the word sequence and the sentence sequence, and a convolutional neural network captures information that spans sentences. The effects of recurrent neural network variants and of different input vectors, including pre-trained embeddings, are also examined. Experimental results show that the proposed approach performs well on several types of datasets.
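The layering described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the weights are random and untrained, a plain tanh RNN stands in for whichever recurrent variants the paper compares, characters are used as tokens in place of pre-trained word vectors, and every name here (split_sentences, rnn, conv_max_pool, classify, init_params, embed) is hypothetical.

```python
import re
import numpy as np

rng = np.random.default_rng(0)

def split_sentences(text):
    # Assumed rule: a sentence ends at Chinese or Western final punctuation.
    return re.findall(r"[^。!?!?]+[。!?!?]?", text)

_table = {}
def embed(token, dim=8):
    # Random lookup table standing in for pre-trained input vectors.
    if token not in _table:
        _table[token] = rng.normal(scale=0.1, size=dim)
    return _table[token]

def rnn(inputs, W_x, W_h, b):
    # Plain tanh RNN; returns every hidden state, shape (T, hidden).
    h = np.zeros(W_h.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.stack(states)

def conv_max_pool(states, filters, width):
    # 1-D convolution over windows of consecutive sentence states,
    # followed by max pooling over window positions.
    T, d = states.shape
    if T < width:  # zero-pad documents shorter than one window
        states = np.vstack([states, np.zeros((width - T, d))])
        T = width
    windows = np.stack([states[i:i + width].ravel()
                        for i in range(T - width + 1)])
    return np.tanh(windows @ filters.T).max(axis=0)

def init_params(emb=8, hid=6, n_filters=4, width=2, n_classes=2):
    def mat(*shape):
        return rng.normal(scale=0.1, size=shape)
    return {
        "word_rnn": (mat(hid, emb), mat(hid, hid), np.zeros(hid)),
        "sent_rnn": (mat(hid, hid), mat(hid, hid), np.zeros(hid)),
        "filters": mat(n_filters, width * hid),
        "width": width,
        "W_out": mat(n_classes, n_filters),
        "b_out": np.zeros(n_classes),
    }

def classify(text, params):
    # Word layer: one RNN pass per sentence; the last state is the sentence vector.
    sent_vecs = [rnn([embed(ch) for ch in sent], *params["word_rnn"])[-1]
                 for sent in split_sentences(text)]
    # Sentence layer: an RNN over the sequence of sentence vectors.
    sent_states = rnn(sent_vecs, *params["sent_rnn"])
    # Cross-sentence layer: convolution plus max pooling gives a document vector.
    doc_vec = conv_max_pool(sent_states, params["filters"], params["width"])
    logits = params["W_out"] @ doc_vec + params["b_out"]
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()  # softmax over sentiment classes

params = init_params()
probs = classify("这部电影很好看。但是结尾有点仓促!", params)
print(probs)
```

The sketch assumes a non-empty input text. Its point is the data flow: splitting by sentence keeps each recurrent pass short, and the convolution window of width 2 looks at adjacent sentence states, which is one simple way to pick up information that spans sentences.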
Authors
罗帆
王厚峰
LUO Fan, WANG Houfeng (Institute of Computational Linguistics, Peking University, Beijing 10087)
Source
《北京大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals
2018, No. 3, pp. 459-465 (7 pages)
Acta Scientiarum Naturalium Universitatis Pekinensis
Funding
Supported by the National Social Science Fund of China (12&ZD227) and the 863 Program (2015AA015402)
Keywords
Chinese sentiment classification
deep learning
convolutional neural network
recurrent neural network