Abstract
As quality of life improves, expectations of service quality rise, and users' reviews of merchants have moved from simple good-or-bad judgments to opinions on specific aspects. Traditional sentiment classification cannot handle this task, for which extracting the aspect-level sentiment words of a text is the key. Most existing models are trained on a pipeline of separate subtasks. To address error propagation across these subtasks and to learn deeper features from the data, an aspect sentiment analysis model based on an interactive grid tagging scheme of bidirectional encoder representations from transformers (BERT), named IGTS-BERT, was proposed. First, rotary position embedding was used to enhance the model's sensitivity to position, and word pairs were tagged in a reduced-dimension manner to improve learning efficiency. Then, two grid tagging networks based on this position encoding were made to learn from each other, giving the model better generalization ability. Finally, to further exploit this interactive learning, two data augmentation methods were proposed: a splicing method and a method that randomly matches aspect words with sentiment words, which further improved the model's performance. Triplet extraction and pair extraction tasks were tested on 4 standard datasets; the F1 value improved by more than 4% on average and by up to 7%. Experimental results show that the IGTS-BERT model exhibits superior performance on sentiment word extraction.
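The abstract names rotary position embedding but does not say how it is wired into the model. As a rough illustration only, the sketch below applies the standard rotary formulation to a matrix of token vectors with NumPy. The function name `rotary_embed` and the `base` value of 10000 are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Rotate each consecutive feature pair of x by a position-dependent angle.

    x has shape (seq_len, dim) with an even dim. The function name and the
    default base are illustrative assumptions, not the authors' settings.
    """
    seq_len, dim = x.shape
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    # One rotation frequency per pair of feature dimensions.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)            # (dim/2,)
    # Angle for position m and pair i is m * inv_freq[i].
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]    # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x, dtype=float)
    # Standard 2-D rotation applied to every (even, odd) pair.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

Because each pair is rotated by an angle proportional to its position, the dot product between a rotated query at position m and a rotated key at position n depends only on the offset m - n, which is the property that makes the encoding sensitive to relative position.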
Authors
WANG Wei (王伟)
LI Ting (李婷)
GE Hongwei (葛洪伟)
School of Artificial Intelligence and Computer, Jiangnan University, Wuxi 214122, China
Funding
National Natural Science Foundation of China (No. 61806006)
Jiangsu University Advantage Discipline Construction Project
Keywords
sentiment word extraction
grid tagging scheme (GTS)
rotary position embedding (RoPE)
interactive learning
data augmentation