
SentiBERT: Pre-training Language Model Combining Sentiment Information (cited by 10)
Abstract: Language models pre-trained on large-scale unlabeled corpora are attracting increasing attention from researchers in natural language processing. Existing models mainly extract semantic and structural features of text during pre-training, whereas sentiment tasks depend on more complex sentiment-specific features. Building on the latest pre-training language model BERT (bidirectional encoder representations from transformers), this paper proposes a pre-training method that focuses on learning sentiment features. In the further pre-training stage on the target domain, the pre-training task of BERT is improved with the help of a sentiment dictionary. At the same time, a context-based word-level sentiment prediction task classifies the sentiment polarity of masked words, yielding text representations biased towards sentiment features. Finally, the model is fine-tuned on a small labeled target dataset. Experimental results show that, compared with the original BERT model, the proposed method improves the accuracy of sentiment tasks by 1 percentage point, and achieves better results especially when training samples are scarce.
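The abstract describes two technical ideas: improving BERT's masked-language-model (MLM) objective with a sentiment dictionary during further pre-training on the target domain, and jointly training a context-based word-level sentiment-polarity prediction task on the masked words. The sketch below shows one way such a multi-task setup could look; it is not the authors' released code, and the bert-base-chinese checkpoint, the toy lexicon, the always-mask-sentiment-words heuristic, and the 3-way polarity labels are all illustrative assumptions.

```python
# Minimal sketch (not the authors' released code) of the two ideas from the
# abstract: (1) sentiment-lexicon-guided masking during further pre-training,
# (2) a word-level sentiment-polarity prediction head trained jointly with the
# masked-language-model (MLM) objective.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")  # assumed checkpoint
encoder = BertModel.from_pretrained("bert-base-chinese")

# Hypothetical single-character sentiment lexicon: token -> polarity id
# (0 = negative, 1 = neutral, 2 = positive).
LEXICON = {"好": 2, "棒": 2, "差": 0, "坏": 0}

class SentiBertSketch(nn.Module):
    """BERT encoder with an MLM head and a word-polarity head (multi-task)."""
    def __init__(self, encoder, vocab_size, num_polarities=3):
        super().__init__()
        self.encoder = encoder
        hidden = encoder.config.hidden_size
        self.mlm_head = nn.Linear(hidden, vocab_size)            # recover masked token
        self.polarity_head = nn.Linear(hidden, num_polarities)   # predict its polarity

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        return self.mlm_head(h), self.polarity_head(h)

def mask_with_lexicon(input_ids, base_prob=0.15):
    """Always mask lexicon words; mask other tokens with the usual 15% rate."""
    mlm_labels = torch.full_like(input_ids, -100)   # -100 is ignored by the loss
    pol_labels = torch.full_like(input_ids, -100)
    masked = input_ids.clone()
    for i, tid in enumerate(input_ids.tolist()):
        token = tokenizer.convert_ids_to_tokens(tid)
        if token in tokenizer.all_special_tokens:
            continue
        is_senti = token in LEXICON
        if is_senti or torch.rand(1).item() < base_prob:
            mlm_labels[i] = tid
            masked[i] = tokenizer.mask_token_id
            if is_senti:
                pol_labels[i] = LEXICON[token]
    return masked, mlm_labels, pol_labels

# One toy further-pre-training step on a single sentence.
model = SentiBertSketch(encoder, tokenizer.vocab_size)
enc = tokenizer("这部电影真的很好", return_tensors="pt")
masked, mlm_labels, pol_labels = mask_with_lexicon(enc["input_ids"][0])
mlm_logits, pol_logits = model(masked.unsqueeze(0), enc["attention_mask"])

loss_fn = nn.CrossEntropyLoss(ignore_index=-100)
loss = loss_fn(mlm_logits.view(-1, tokenizer.vocab_size), mlm_labels.view(-1)) \
     + loss_fn(pol_logits.view(-1, 3), pol_labels.view(-1))
loss.backward()
```

In this sketch the two cross-entropy losses are summed with equal weight; the paper's actual masking strategy, lexicon construction, and loss weighting may differ, and fine-tuning on the small labeled target set would follow this further pre-training stage.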
Authors: YANG Chen (杨晨); SONG Xiaoning (宋晓宁); SONG Wei (宋威) — School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD-indexed, Peking University core journal, 2020, No. 9, pp. 1563-1570 (8 pages)
Funding: National Key R&D Program sub-project (No. 2017YFC1601800); National Natural Science Foundation of China (Nos. 61876072, 61673193); China Postdoctoral Science Foundation Special Grant (No. 2018T110441); Six Talent Peaks Project of Jiangsu Province (No. XYDXX-012).
Keywords: bidirectional encoder representations from transformers (BERT); sentiment classification; pre-training language models; multi-task learning