
Transfer Learning Oriented Text Feature Alignment Algorithm  Cited by: 7
Abstract  Inconsistency between the feature spaces of the source domain and the target domain reduces the accuracy of transfer learning. To address this, the paper proposes a Word2Vec-based algorithm for aligning features across domains. Only adjectives, adverbs, nouns, and verbs are selected as features. For each part of speech, pivot features of the source and target domains are selected, and for each pivot feature the non-pivot feature with the greatest semantic similarity to it is computed separately in the source domain and in the target domain. These are taken as similar pivot features, so that each pivot feature yields one similar pivot feature pair. Every similar pivot feature appearing in these domains is then replaced according to its pair, which aligns semantically similar features from different domains, and machine learning is performed on the source and target domain data after the replacement. Experimental results show that the algorithm achieves an average classification accuracy of 88.2%, higher than the Baseline algorithm.
Source  Computer Engineering (《计算机工程》; indexed in CAS, CSCD, and the Peking University Core Journals list), 2017, No. 2, pp. 215-219, 226 (6 pages)
Funding  National Natural Science Foundation of China (61572102, 61562080); Research Fund of Dalian University of Foreign Languages (2014XJQN14)
Keywords  transfer learning; feature alignment; sentiment analysis; source domain; target domain
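
The alignment procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy two-dimensional vectors standing in for Word2Vec embeddings, the use of cosine similarity, and the choice to replace both members of a pair with a shared `source|target` token are all assumptions made for illustration. The abstract does not say how pivot features are chosen or what the replacement token looks like, so the pivots are passed in as given. The sketch handles a single part of speech and would be run once per part of speech.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_non_pivot(pivot, domain_vocab, pivots, vectors):
    """Return the non-pivot word in one domain that is closest to the pivot."""
    if pivot not in vectors:
        return None
    # sorted() only makes tie-breaking deterministic; it is not part of the method.
    candidates = [w for w in sorted(domain_vocab)
                  if w not in pivots and w in vectors]
    if not candidates:
        return None
    return max(candidates, key=lambda w: cosine(vectors[pivot], vectors[w]))


def build_pivot_pairs(pivots, source_vocab, target_vocab, vectors):
    """Map each pivot to its (source-side, target-side) similar feature pair."""
    pairs = {}
    for pivot in sorted(pivots):
        src = most_similar_non_pivot(pivot, source_vocab, pivots, vectors)
        tgt = most_similar_non_pivot(pivot, target_vocab, pivots, vectors)
        if src is not None and tgt is not None:
            pairs[pivot] = (src, tgt)
    return pairs


def align_tokens(tokens, pairs):
    """Replace both members of every pair with one shared token (assumed scheme)."""
    mapping = {}
    for src, tgt in pairs.values():
        shared = f"{src}|{tgt}"  # hypothetical shared token, not from the paper
        mapping[src] = shared
        mapping[tgt] = shared
    return [mapping.get(tok, tok) for tok in tokens]


# Toy data: made-up vectors, not real Word2Vec output.
toy_vectors = {
    "good": (1.0, 0.0),
    "readable": (0.9, 0.1),
    "plot": (0.0, 1.0),
    "durable": (0.8, 0.2),
    "blender": (0.1, 1.0),
}
toy_pairs = build_pivot_pairs(
    pivots={"good"},
    source_vocab={"good", "readable", "plot"},
    target_vocab={"good", "durable", "blender"},
    vectors=toy_vectors,
)
print(toy_pairs)
print(align_tokens(["readable", "plot"], toy_pairs))
print(align_tokens(["durable", "blender"], toy_pairs))
```

In this toy setting the pivot "good" pairs "readable" from the source domain with "durable" from the target domain, so after replacement both domains share one feature for the two words, while unrelated words such as "plot" and "blender" are left unchanged. A classifier would then be trained on the replaced source data and applied to the replaced target data.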
