Abstract
Static word vector models such as Word2vec assign each word a single vector and therefore cannot capture polysemy across different contexts. To address this problem, a text sentiment classification method based on dynamic word vectors and an attention mechanism is proposed. General-purpose word vectors are first pre-trained on a large corpus with a deep bidirectional language model. The vector model is then fine-tuned on the training corpus of the sentiment classification task, and the resulting context-dependent dynamic word vectors are used as input features. Finally, a bidirectional Long Short-Term Memory (biLSTM) model is built, and an attention mechanism is introduced to improve the accuracy of feature extraction. Experimental results show that the proposed method improves classification accuracy on the IMDB and Yelp13 datasets by 0.017 and 0.011, respectively.
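The attention step on top of the biLSTM can be illustrated with a minimal sketch. The abstract does not state which scoring function the authors use, so the dot-product score against a single query vector below is an assumption, as are the names `softmax`, `attention_pool`, `hidden`, and `query`. In a trained model the hidden states would come from the biLSTM and the query would be learned; here both are fixed toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-token hidden states into one sentence vector.

    hidden_states: list of T vectors (each a list of d floats), standing in
        for the concatenated forward/backward biLSTM outputs per token.
    query: length-d vector; learned in a real model, fixed here (assumption).
    Returns (weights, context), where weights sum to 1 over the T tokens.
    """
    # Dot-product relevance score of each token (assumed scoring function).
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum of the hidden states, one output value per dimension.
    context = [
        sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)
    ]
    return weights, context


# Toy input: three tokens with two-dimensional hidden states.
hidden = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden, query)
```

The resulting `context` vector would then be passed to a classifier layer. Tokens whose hidden states align with the query receive larger weights, which is how attention lets the model emphasize sentiment-bearing words.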
Authors
王璐琳
马力
Wang Lulin; Ma Li (School of Computer, Xi’an University of Posts and Telecommunications, Xi’an 710061, Shaanxi, China)
Source
《计算机应用与软件》
PKU Core Journal (Peking University core journal list)
2021, Issue 5, pp. 164-169, 182 (7 pages in total)
Computer Applications and Software
Funding
National Natural Science Foundation of China (61373116)
Natural Science Foundation of Shaanxi Province (2016JM6085).