Abstract
Inconsistency between the feature spaces of the source and target domains reduces the accuracy of transfer learning. To address this problem, this paper proposes a cross-domain feature alignment algorithm based on Word2Vec. Only adjectives, adverbs, nouns, and verbs are used as features. For each part of speech, pivot features are selected from the source and target domains. For each pivot feature, the non-pivot feature with the highest semantic similarity to it is computed separately in the source domain and in the target domain and taken as a similar pivot feature, so that each pivot feature yields one similar pivot feature pair. Every similar pivot feature that appears in these domains is then replaced according to its pair, which aligns semantically similar features across domains. Machine learning is then performed on the source-domain and target-domain data after feature replacement. Experimental results show that the proposed algorithm achieves an average classification accuracy of 88.2%, higher than that of the Baseline algorithm.
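The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy two-dimensional vectors, and the example words are all hypothetical, and the pivot list is supplied by hand because the abstract does not say how pivots are chosen. The abstract also does not say what a similar pivot feature is replaced with, so the sketch assumes that both members of a pair are mapped to their shared pivot. Part-of-speech filtering is omitted, and in practice the vectors would come from a trained Word2Vec model rather than a hand-written dictionary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_non_pivot(pivot, vocab, pivots, vectors):
    """Return the non-pivot word in vocab most similar to pivot, or None."""
    candidates = [w for w in vocab if w not in pivots and w in vectors]
    if not candidates:
        return None
    return max(candidates, key=lambda w: cosine(vectors[pivot], vectors[w]))


def build_alignment(pivots, src_vocab, tgt_vocab, vectors):
    """Map each similar pivot feature to its pivot (assumed replacement rule)."""
    mapping = {}
    for pivot in pivots:
        src_similar = nearest_non_pivot(pivot, src_vocab, pivots, vectors)
        tgt_similar = nearest_non_pivot(pivot, tgt_vocab, pivots, vectors)
        if src_similar is None or tgt_similar is None:
            continue  # no complete pair for this pivot
        # setdefault keeps the first pivot if a word is nearest to several.
        mapping.setdefault(src_similar, pivot)
        mapping.setdefault(tgt_similar, pivot)
    return mapping


def align(tokens, mapping):
    """Replace similar pivot features in a token list; leave other tokens as-is."""
    return [mapping.get(tok, tok) for tok in tokens]


# Toy embeddings (hypothetical values, for illustration only).
vectors = {
    "good": [1.0, 0.0],
    "readable": [0.9, 0.1],   # occurs only in the source domain (books)
    "durable": [0.8, 0.2],    # occurs only in the target domain (electronics)
    "boring": [-1.0, 0.1],
    "flimsy": [-0.9, 0.2],
}
pivots = ["good"]
src_vocab = {"good", "readable", "boring"}
tgt_vocab = {"good", "durable", "flimsy"}

mapping = build_alignment(pivots, src_vocab, tgt_vocab, vectors)
aligned_source = align(["a", "readable", "story"], mapping)
aligned_target = align(["durable", "battery"], mapping)
```

After alignment, "readable" in the source data and "durable" in the target data share a single feature, so a classifier trained on the source domain sees the same token in the target domain.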
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2017, No. 2, pp. 215-219, 226 (6 pages)
Funding
National Natural Science Foundation of China (61572102, 61562080)
Scientific Research Fund of Dalian University of Foreign Languages (2014XJQN14)
Keywords
transfer learning
feature alignment
sentiment analysis
source domain
target domain