Abstract
Compared with traditional word representations (one-hot, word2vec, etc.), the BERT (Bidirectional Encoder Representations from Transformers) pretrained language model represents words dynamically according to context and achieved state-of-the-art results on 11 downstream tasks. Fine-tuning BERT on a task-specific corpus has become widely used in sentiment analysis and yields good results; however, such models typically use only the output features of BERT's last encoder layer for classification, ignoring the semantic features learned by the other layers. Unlike previous BERT-based classification models, BERT-MLF fuses the aspect features output by every BERT encoder layer and extracts the key semantic features across layers through a convolution layer, reducing the influence of redundant information and making full use of the information learned by each encoder layer. Extensive experiments on the Laptop and Restaurant datasets of SemEval-2014 Task 4 show that the method achieves good classification performance.
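The abstract does not give implementation details of BERT-MLF, so the following is only a minimal sketch of the multi-layer fusion idea it describes: collecting a feature from every BERT encoder layer and convolving across the layer dimension before classification. The use of the [CLS] position as the per-layer aspect feature, the Conv1d configuration, the number of classes, and the model name bert-base-uncased are assumptions for illustration, not the authors' actual design.

```python
# Illustrative sketch only: layer pooling, convolution settings, and model names
# below are assumptions; the abstract does not specify the exact BERT-MLF design.
import torch
import torch.nn as nn
from transformers import BertModel


class BertMultiLayerFusion(nn.Module):
    """Fuses a feature from every BERT encoder layer with a 1-D convolution
    across the layer dimension, then classifies sentiment polarity."""

    def __init__(self, num_classes: int = 3, bert_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name, output_hidden_states=True)
        hidden = self.bert.config.hidden_size            # 768 for bert-base
        num_layers = self.bert.config.num_hidden_layers  # 12 for bert-base
        # Convolve over the stack of per-layer features to extract key semantic
        # information and suppress redundancy between layers.
        self.conv = nn.Conv1d(in_channels=num_layers, out_channels=1,
                              kernel_size=3, padding=1)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # hidden_states is a tuple of (num_layers + 1) tensors of shape
        # (batch, seq_len, hidden); index 0 is the embedding layer, so skip it.
        layer_feats = [h[:, 0, :] for h in outputs.hidden_states[1:]]
        stacked = torch.stack(layer_feats, dim=1)   # (batch, num_layers, hidden)
        fused = self.conv(stacked).squeeze(1)       # (batch, hidden)
        return self.classifier(fused)
```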
Source
《计算机科学与应用》
2020, No. 12, pp. 2147-2158 (12 pages)
Computer Science and Application