
A Weighted Algorithm of Inductive Transfer Learning Based on Maximum Entropy Model (一种基于最大熵模型的加权归纳迁移学习方法) · Cited by: 4

Abstract: Traditional machine learning and data mining algorithms rest on two assumptions: that the training and test data share the same feature space, and that they follow the same distribution. In real applications, data distributions change frequently, so these assumptions often fail to hold and most traditional algorithms are no longer applicable; they would require re-collecting and re-labeling large amounts of data, which is expensive and time-consuming. Transfer learning, a new learning framework, can effectively solve this problem by transferring knowledge learned from one or more source domains to a target domain. This paper focuses on an important branch of the field, inductive transfer learning, and proposes WTLME, a weighted inductive transfer learning algorithm based on the maximum entropy model. The algorithm transfers the parameters of a model trained on the source domain to the target domain and adjusts the weights of target-domain instances, thereby obtaining a more accurate target-domain model, speeding up learning, and achieving domain adaptation. Experimental results demonstrate the effectiveness of the algorithm.
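The abstract names two mechanisms without detailing them: a maximum entropy model, whose conditional form is p(y|x) = exp(Σ_k λ_k f_k(x, y)) / Z(x) (see reference 3 below), and AdaBoost-style instance reweighting (references 2 and 10). Under that reading, "transferring model parameters" amounts to reusing the source-domain weights as a starting point or prior in the target domain. The following is a minimal sketch of how these two pieces could fit together, not the paper's actual WTLME procedure; all names and hyperparameters (fit_maxent, wtlme_sketch, sigma, n_rounds) are illustrative.

```python
# Sketch only: maximum entropy (logistic regression) learner whose parameters
# are pulled toward the source-domain model (parameter transfer), wrapped in
# an AdaBoost-style reweighting loop over target-domain instances.
# This is NOT the paper's exact WTLME algorithm; names are hypothetical.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_maxent(X, y, w, theta_src, sigma=1.0, lr=0.1, epochs=200):
    """Instance-weighted logistic regression with a Gaussian prior centered
    on the source-domain parameters theta_src (the parameter transfer)."""
    theta = theta_src.copy()
    for _ in range(epochs):
        p = sigmoid(X @ theta)
        # Weighted log-likelihood gradient, plus a pull toward theta_src.
        grad = X.T @ (w * (y - p)) - (theta - theta_src) / sigma ** 2
        theta += lr * grad / len(y)
    return theta

def wtlme_sketch(X, y, theta_src, n_rounds=10):
    """AdaBoost-style loop: up-weight target instances that the current
    transferred model misclassifies, then refit."""
    n = len(y)
    w = np.ones(n) / n                      # uniform initial instance weights
    models, alphas = [], []
    for _ in range(n_rounds):
        theta = fit_maxent(X, y, w, theta_src)
        miss = (sigmoid(X @ theta) >= 0.5).astype(float) != y
        err = np.clip(w[miss].sum() / w.sum(), 1e-10, 0.4999)
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(alpha * miss)           # misclassified instances gain weight
        w /= w.sum()
        models.append(theta)
        alphas.append(alpha)
    return models, alphas

def predict(models, alphas, X):
    """Weighted vote of the per-round maximum entropy models."""
    score = sum(a * (2.0 * sigmoid(X @ th) - 1.0)
                for th, a in zip(models, alphas))
    return (score >= 0).astype(float)
```

Pulling theta toward the source parameters via a Gaussian prior is one standard way to realize parameter transfer (compare the prior-based adaptation of Daumé and Marcu, reference 9); the paper's actual transfer and reweighting rules may differ.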
Source: Journal of Computer Research and Development (《计算机研究与发展》; indexed in EI, CSCD, Peking University Core), 2011, No. 9: 1722-1728 (7 pages).
Funding: National Key Basic Research Program of China (973 Program) (2009CB326203); National Natural Science Foundation of China (60975034); Natural Science Foundation of Anhui Province (090412044).
Keywords: machine learning; data mining; transfer learning; maximum entropy; inductive; AdaBoost

References (14)

  • 1 Pan S, Yang Q. A survey on transfer learning [J]. IEEE Trans on Knowledge and Data Engineering, 2010, 22(10): 1345-1359.
  • 2 Dai W, Yang Q, Xue G, et al. Boosting for transfer learning [C] //Proc of the 24th Int Conf on Machine Learning. New York: ACM, 2007: 193-200.
  • 3 Berger A L, Della Pietra S A, Della Pietra V J. A maximum entropy approach to natural language processing [J]. Computational Linguistics, 1996, 22(1): 39-71.
  • 4 Raina R, Battle A, Lee H, et al. Self-taught learning: Transfer learning from unlabeled data [C] //Proc of the 24th Int Conf on Machine Learning. New York: ACM, 2007: 759-766.
  • 5 Blitzer J, Dredze M, Pereira F. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification [C] //Proc of the 45th Annual Meeting of the Association of Computational Linguistics. Prague, Czech Republic: ACL Press, 2007: 432-439.
  • 6 Cao B, Liu N, Yang Q. Transfer learning for collective link prediction in multiple heterogeneous domains [C] //Proc of the 27th Int Conf on Machine Learning. Haifa, Israel: Omnipress, 2010: 159-166.
  • 7 Arnold A, Nallapati R, Cohen W W. A comparative study of methods for transductive transfer learning [C] //Proc of the 7th IEEE Int Conf on Data Mining Workshops. Los Alamitos, CA: IEEE Computer Society, 2007: 77-82.
  • 8 Dai W, Xue G, Yang Q, et al. Transferring naive Bayes classifiers for text classification [C] //Proc of the 22nd AAAI Conf on Artificial Intelligence. Vancouver, British Columbia: AAAI Press, 2007: 540-545.
  • 9 Daumé III H, Marcu D. Domain adaptation for statistical classifiers [J]. Journal of Artificial Intelligence Research, 2006, 26: 101-126.
  • 10 Freund Y, Schapire R E. A short introduction to boosting [J]. Journal of Japanese Society for Artificial Intelligence, 1999, 14(5): 771-780.

Secondary References (22)

  • 1 Sang E F T K, Daelemans W, Déjean H, et al. Applying system combination to base noun phrase identification [C] //Proc of COLING 2000. Saarbrücken, Germany: Morgan Kaufmann, 2000: 857-863.
  • 2 Zhou Ming. Corpus-based Chinese maximum noun phrase extraction [C] //Computer Linguistic Development and Application (in Chinese). Beijing: Tsinghua University Press, 1995: 50-55.
  • 3 Church K W. A stochastic parts program and noun phrase parser for unrestricted text [C] //Proc of the 2nd Conf on Applied Natural Language Processing. Austin, TX, USA: Kluwer Academic Publishers, 1988: 136-143.
  • 4 Abney S P. Parsing by chunks [C] //Berwick R C, Abney S P, eds. Principle-Based Parsing: Computation and Psycholinguistics. Boston, USA: Kluwer Academic Publishers, 1991: 257-278.
  • 5 Ramshaw L A, Marcus M P. Text chunking using transformation-based learning [C] //Proc of the 3rd Workshop on Very Large Corpora. Kluwer Academic Publishers, 1995: 82-94.
  • 6 Ratnaparkhi A. Learning to parse natural language with maximum entropy models [J]. Machine Learning, 1999, 34(1/2/3): 151-176.
  • 7 Fan Xiao. Static phrase and dynamic phrase [C] //Grammar Concept from Three Sides (in Chinese). Beijing: Beijing Language and Culture University Press, 1996.
  • 8 Koeling R. Chunking with maximum entropy models [C] //Proc of CoNLL 2000. Lisbon, Portugal: Association for Computational Linguistics, 2000.
  • 9 Berger A L, Della Pietra S A, Della Pietra V J. A maximum entropy approach to natural language processing [J]. Computational Linguistics, 1996, 22(1): 39-71.
  • 10 Berger A L. The improved iterative scaling algorithm: A gentle introduction [R]. School of Computer Science, Carnegie Mellon University, 1997.

Co-citing Literature (61)

Co-cited Literature (39)

  • 1 Suzuki T, Sugiyama M, Kanamori T, et al. Mutual information estimation reveals global associations between stimuli and biological processes [J]. BMC Bioinformatics, 2009, 10(1): S52.
  • 2 Suzuki T, Sugiyama M. Sufficient dimension reduction via squared-loss mutual information estimation [C] //Proc of the Int Conf on Artificial Intelligence and Statistics. [s.l.]: [s.n.], 2010: 804-811.
  • 3 Sugiyama M, Suzuki T, Nakajima S, et al. Direct importance estimation for covariate shift adaptation [J]. Annals of the Institute of Statistical Mathematics, 2008, 60(4): 699-746.
  • 4 Yamada M, Sugiyama M. Direct importance estimation with Gaussian mixture models [J]. IEICE Trans on Information and Systems, 2009, E92-D(10): 2159-2162.
  • 5 Tipping M E, Bishop C M. Mixtures of probabilistic principal component analyzers [J]. Neural Computation, 1999, 11(2): 443-482.
  • 6 Bishop C M. Pattern recognition and machine learning [M]. New York, USA: Springer-Verlag, 2006.
  • 7 Meng Jiana. Research on the application of transfer learning in text classification [D]. Dalian: Dalian University of Technology, 2011 (in Chinese).
  • 8 Kamishima T, Hamasaki M, Akaho S. TrBagg: A simple transfer learning method and its application to personalization in collaborative tagging [C] //Proc of the IEEE Int Conf on Data Mining. USA: IEEE, 2009: 219-228.
  • 9 Yang P, Gao W. Information-theoretic multi-view domain adaptation: A theoretical and empirical study [J]. Journal of Artificial Intelligence Research, 2014, 49: 501-525.
  • 10 Xia R, Hu X, Lu J, et al. Instance selection and instance weighting for cross-domain sentiment classification via PU learning [C] //Proc of the 23rd Int Joint Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2013: 2176-2182.

Citing Literature (4)

Secondary Citing Literature (7)
