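Abstract
[Objective] To address the relative scarcity of labelled data in cross-domain sentiment classification and the problem of distinguishing the importance of sentiment classification features when transferring from the source domain to the target domain. [Methods] We propose a cross-domain bidirectional long short-term memory (BiLSTM) model for sentiment classification of product reviews, based on a feature-fusion representation and the attention mechanism. The model fuses BERT word vectors with cross-domain word vectors to generate a unified cross-domain feature space, and uses a BiLSTM network combined with the attention mechanism to extract importance weights for global and local features. [Results] Controlled experiments on the public Amazon product review dataset show that the model's average accuracy for cross-domain review sentiment classification reaches 95.93%, the highest among the compared models and 9.33% higher than the highest accuracy of the baseline models reported in the literature. [Limitations] The model's generalizability needs further testing on large-scale multi-domain datasets, and the patterns by which source-domain knowledge contributes to sentiment classification of target-domain reviews remain to be explored. [Conclusions] Learning the fused features through the BiLSTM layer effectively captures sentiment semantic information, and the source domain most helpful to a given target domain was largely consistent across the controlled experiments.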
[Objective] This paper addresses the shortage of labelled data in cross-domain sentiment classification and the problem of distinguishing the weights of sentiment features across domains. [Methods] We propose a sentiment classification model for cross-domain product reviews based on feature fusion representation and the attention mechanism. First, the model integrates BERT and cross-domain word vectors to generate a unified cross-domain feature space. Then, it extracts the weights of global and local features through the attention mechanism. [Results] We evaluated the model on public review data from Amazon and found that its average accuracy reached 95.93%, which was 9.33% higher than that of the best existing model. [Limitations] More research is needed to evaluate the model on large-scale multi-domain datasets. [Conclusions] The proposed model can effectively analyze sentiment information.
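The two steps named in the Methods statement, fusing two word-vector sources into one feature space and then weighting token features with attention, can be illustrated with a minimal sketch. It is not the authors' implementation: the BiLSTM layer is omitted (the fused vectors stand in for its hidden states), fusion is assumed to be plain concatenation, and the names `fuse`, `attention_pool`, and `query`, as well as all numbers, are hypothetical toy values chosen for illustration.

```python
import math

def fuse(bert_vecs, domain_vecs):
    """Concatenate, token by token, vectors from two embedding sources.
    Assumption: fusion is simple concatenation; the paper may differ."""
    return [b + d for b, d in zip(bert_vecs, domain_vecs)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Dot-product attention: score each token vector against a query
    vector, normalize the scores, and return the weights together with
    the weighted sum of the token vectors."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[j] for w, h in zip(weights, hidden_states))
              for j in range(dim)]
    return weights, pooled

# Toy inputs for a three-token review (hypothetical values).
bert_vecs = [[0.1, 0.3], [0.7, -0.2], [0.0, 0.5]]   # stand-in for BERT vectors
domain_vecs = [[0.9], [0.1], [0.4]]                 # stand-in for cross-domain vectors
query = [0.5, -0.5, 1.0]                            # stand-in for a learned attention query

fused = fuse(bert_vecs, domain_vecs)
weights, pooled = attention_pool(fused, query)
print("attention weights:", weights)
print("pooled feature:", pooled)
```

In a full model the pooled vector would feed a classifier, and the query would be learned during training rather than fixed by hand.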
Authors
祁瑞华
简悦
郭旭
关菁华
杨明昕
Qi Ruihua; Jian Yue; Guo Xu; Guan Jinghua; Yang Mingxin (Research Center for Language Intelligence, Dalian University of Foreign Languages, Dalian 116044, China; School of Software Engineering, Dalian University of Foreign Languages, Dalian 116044, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core (Peking University Core Journals)
2020, No. 12, pp. 85-94 (10 pages)
Data Analysis and Knowledge Discovery
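Funding
Liaoning Province Higher Education Innovative Talents Program (Grant No. WR2019005)
National Social Science Fund of China, General Program, "Opinion Mining of Overseas Readers' Online Reviews of English Translations of Chinese Classics" (Grant No. 15BYY028)
Liaoning Provincial Social Science Planning Fund, General Program, "Early Warning of Emergency-Related Rumors in the Big Data Environment" (Grant No. L17BTQ005). This paper is one of the research outcomes of the projects listed above.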
Keywords
Feature Fusion
Attention Mechanism
Cross-Domain
Sentiment Classification