Abstract
The word-level shallow convolutional neural network (CNN) model achieves good performance on text classification tasks. However, shallow CNN models cannot capture long-range dependencies, which limits their effectiveness, and simply stacking more layers does not improve performance. This paper proposes a new word-level text classification model, Word-CNN-Att, which uses a CNN to capture local features and position information and a self-attention mechanism to capture long-range dependencies. On five public datasets (AGNews, DBPedia, Yelp Review Polarity, Yelp Review Full, and Yahoo! Answers), Word-CNN-Att improves accuracy over the word-level shallow CNN model by 0.9%, 0.2%, 0.5%, 2.1%, and 2.0%, respectively.
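The abstract describes a pipeline in which a convolution first extracts local (window-level) features and a self-attention layer then relates positions across the whole sequence. A minimal NumPy sketch of that combination is below; all dimensions, weight shapes, and the mean-pooling classifier head are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conv1d(x, w):
    """Same-padded 1-D convolution over a sequence.
    x: (seq_len, d_in) word embeddings; w: (k, d_in, d_out) filter bank."""
    k, d_in, d_out = w.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([xp[i:i + k].reshape(-1) @ w.reshape(k * d_in, d_out)
                    for i in range(x.shape[0])])
    return np.maximum(out, 0)  # ReLU: local features per position

def self_attention(h, wq, wk, wv):
    """Scaled dot-product self-attention: every position attends to all
    others, so dependencies are captured regardless of distance."""
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return scores @ v

rng = np.random.default_rng(0)
seq_len, d, n_classes = 8, 16, 5            # toy sizes (assumed)
x = rng.normal(size=(seq_len, d))           # stand-in word embeddings
h = conv1d(x, 0.1 * rng.normal(size=(3, d, d)))   # window-3 local features
z = self_attention(h, *(0.1 * rng.normal(size=(d, d)) for _ in range(3)))
logits = z.mean(axis=0) @ rng.normal(size=(d, n_classes))  # pooled classifier
probs = softmax(logits)
print(probs.shape)
```

The convolution sees only a k-word window at each position, while the attention matrix `scores` is (seq_len, seq_len), so every output position can draw on every input position; this is the division of labor the abstract attributes to Word-CNN-Att.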
Authors
WANG Jia-Wei; YANG Xu-Chen; JU Sheng-Gen; YUAN Xiao; XIE Zheng-Wen (College of Computer Science, Sichuan University, Chengdu 610065, China)
Source
《四川大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journal List (北大核心)
2020, No. 3, pp. 469-475 (7 pages)
Journal of Sichuan University(Natural Science Edition)
Funding
Supported by the 2018 Sichuan Province New Generation Artificial Intelligence Major Science and Technology Project (2018GZDZX0039).
Keywords
Text classification
Convolutional neural network
Self-attention mechanism
Long-range dependencies