Abstract
In Natural Language Processing (NLP), sentence representation methods capture different kinds of information from text: Convolutional Neural Networks (CNNs) capture phrase-level information, while Recurrent Neural Networks (RNNs) capture word-order information. The self-attention mechanism can model the importance of arbitrary word pairs regardless of their distance, but it lacks information about the relative positions of words. We propose the Relative Positional Self-Attention Network (RPSAN). In this model, we design a distant-mask matrix that extracts local information from a sentence by masking the self-attention values of word pairs whose relative distance is large. In addition, we design a new fusion mechanism that merges different sentence representations through the softmax function, reducing model complexity. Experiments show that, compared with other attention-based models, RPSAN achieves the best performance and the lowest training cost on the Stanford Sentiment Treebank (SST) dataset, and obtains the best test accuracy on four other public text classification datasets.
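To make the two ideas in the abstract concrete, below is a minimal NumPy sketch of (a) self-attention restricted by a distance mask, so that word pairs farther apart than a window contribute nothing, and (b) a softmax-gated fusion of several sentence vectors. The window size, the gate scores, and all helper names here are illustrative assumptions; the abstract does not specify the paper's exact formulation.

```python
# Illustrative sketch only; window size, gating form, and names are assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distance_mask(n, window):
    """0 where |i - j| <= window, -inf elsewhere (masks distant word pairs)."""
    idx = np.arange(n)
    far = np.abs(idx[:, None] - idx[None, :]) > window
    return np.where(far, -np.inf, 0.0)

def masked_self_attention(X, window):
    """Scaled dot-product self-attention over X (n x d), restricted to
    word pairs within `window` positions of each other."""
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)        # pairwise importance
    scores = scores + distance_mask(n, window)
    weights = softmax(scores, axis=-1)   # distant pairs receive weight ~0
    return weights @ X                   # locally contextualized word vectors

def softmax_fusion(reps):
    """Fuse k sentence vectors (each d-dim) with softmax-normalized scalar
    gates; a stand-in for the learned scores a real model would produce."""
    scores = np.array([r.mean() for r in reps])  # assumed scoring, not the paper's
    gates = softmax(scores)
    return np.tensordot(gates, np.stack(reps), axes=1)

# Toy usage: 6 words, 8-dim embeddings, local window of 2.
X = np.random.randn(6, 8)
local_view = masked_self_attention(X, window=2).mean(axis=0)
global_view = (softmax(X @ X.T / np.sqrt(8), axis=-1) @ X).mean(axis=0)
sentence_vec = softmax_fusion([local_view, global_view])
print(sentence_vec.shape)  # (8,)
```

The mask is applied before the softmax, so masked scores become exp(-inf) = 0 and each word attends only within its window; the fusion step then weights the local and global views with gates that sum to one.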
Authors
XU Ruo-yi; LI Jin-long (Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China)
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journal list
2020, No. 2, pp. 225-229 (5 pages)
Funding
Supported by the General Program of the National Natural Science Foundation of China (No. 61573328).
Keywords
self-attention mechanism
sentence representation
fusion mechanism
text classification