
A Weighted Algorithm of Inductive Transfer Learning Based on Maximum Entropy Model (一种基于最大熵模型的加权归纳迁移学习方法) · Cited by: 4

Abstract: Traditional machine learning and data mining algorithms rest on two assumptions: that the training and test data share the same feature space, and that they follow the same distribution. In real applications, data distributions change frequently, so these assumptions often fail to hold and most traditional algorithms are no longer applicable; they would require re-collecting and re-labeling large amounts of data, which is expensive and time-consuming. Transfer learning, a new learning framework, can effectively solve this problem by transferring knowledge learned from one or more source domains to a target domain. This paper focuses on an important branch of the field, inductive transfer learning, and proposes WTLME, a weighted inductive transfer learning algorithm based on the maximum entropy model. The algorithm transfers the parameters of a model trained on the source domain to the target domain and adjusts the weights of target-domain instances, thereby obtaining a more accurate target-domain model, speeding up learning, and achieving domain adaptation. Experimental results demonstrate the effectiveness of the algorithm.
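The abstract names two mechanisms without detailing them: a maximum entropy model, whose conditional form is p(y|x) = exp(Σ_k λ_k f_k(x, y)) / Z(x) (see reference 3 below), and AdaBoost-style instance reweighting (references 2 and 10). Under that reading, "transferring model parameters" amounts to reusing the source-domain weights as a starting point or prior in the target domain. The following is a minimal sketch of how these two pieces could fit together, not the paper's actual WTLME procedure; all names and hyperparameters (fit_maxent, wtlme_sketch, sigma, n_rounds) are illustrative.

```python
# Sketch only: maximum entropy (logistic regression) learner whose parameters
# are pulled toward the source-domain model (parameter transfer), wrapped in
# an AdaBoost-style reweighting loop over target-domain instances.
# This is NOT the paper's exact WTLME algorithm; names are hypothetical.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_maxent(X, y, w, theta_src, sigma=1.0, lr=0.1, epochs=200):
    """Instance-weighted logistic regression with a Gaussian prior centered
    on the source-domain parameters theta_src (the parameter transfer)."""
    theta = theta_src.copy()
    for _ in range(epochs):
        p = sigmoid(X @ theta)
        # Weighted log-likelihood gradient, plus a pull toward theta_src.
        grad = X.T @ (w * (y - p)) - (theta - theta_src) / sigma ** 2
        theta += lr * grad / len(y)
    return theta

def wtlme_sketch(X, y, theta_src, n_rounds=10):
    """AdaBoost-style loop: up-weight target instances that the current
    transferred model misclassifies, then refit."""
    n = len(y)
    w = np.ones(n) / n                      # uniform initial instance weights
    models, alphas = [], []
    for _ in range(n_rounds):
        theta = fit_maxent(X, y, w, theta_src)
        miss = (sigmoid(X @ theta) >= 0.5).astype(float) != y
        err = np.clip(w[miss].sum() / w.sum(), 1e-10, 0.4999)
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(alpha * miss)           # misclassified instances gain weight
        w /= w.sum()
        models.append(theta)
        alphas.append(alpha)
    return models, alphas

def predict(models, alphas, X):
    """Weighted vote of the per-round maximum entropy models."""
    score = sum(a * (2.0 * sigmoid(X @ th) - 1.0)
                for th, a in zip(models, alphas))
    return (score >= 0).astype(float)
```

Pulling theta toward the source parameters via a Gaussian prior is one standard way to realize parameter transfer (compare the prior-based adaptation of Daumé and Marcu, reference 9); the paper's actual transfer and reweighting rules may differ.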
Source: Journal of Computer Research and Development (《计算机研究与发展》; indexed in EI, CSCD, Peking University Core), 2011, No. 9: 1722-1728 (7 pages).
Funding: National Key Basic Research Program of China (973 Program) (2009CB326203); National Natural Science Foundation of China (60975034); Natural Science Foundation of Anhui Province (090412044).
Keywords: machine learning; data mining; transfer learning; maximum entropy; inductive; AdaBoost

References (14)

  • 1 Pan S, Yang Q. A survey on transfer learning [J]. IEEE Trans on Knowledge and Data Engineering, 2010, 22(10): 1345-1359.
  • 2 Dai W, Yang Q, Xue G, et al. Boosting for transfer learning [C] //Proc of the 24th Int Conf on Machine Learning. New York: ACM, 2007: 193-200.
  • 3 Berger A L, Della Pietra S A, Della Pietra V J. A maximum entropy approach to natural language processing [J]. Computational Linguistics, 1996, 22(1): 39-71.
  • 4 Raina R, Battle A, Lee H, et al. Self-taught learning: Transfer learning from unlabeled data [C] //Proc of the 24th Int Conf on Machine Learning. New York: ACM, 2007: 759-766.
  • 5 Blitzer J, Dredze M, Pereira F. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification [C] //Proc of the 45th Annual Meeting of the Association of Computational Linguistics. Prague, Czech Republic: ACL Press, 2007: 432-439.
  • 6 Cao B, Liu N, Yang Q. Transfer learning for collective link prediction in multiple heterogeneous domains [C] //Proc of the 27th Int Conf on Machine Learning. Haifa, Israel: Omnipress, 2010: 159-166.
  • 7 Arnold A, Nallapati R, Cohen W W. A comparative study of methods for transductive transfer learning [C] //Proc of the 7th IEEE Int Conf on Data Mining Workshops. Los Alamitos, CA: IEEE Computer Society, 2007: 77-82.
  • 8 Dai W, Xue G, Yang Q, et al. Transferring naive Bayes classifiers for text classification [C] //Proc of the 22nd AAAI Conf on Artificial Intelligence. Vancouver, British Columbia: AAAI Press, 2007: 540-545.
  • 9 Daumé III H, Marcu D. Domain adaptation for statistical classifiers [J]. Journal of Artificial Intelligence Research, 2006, 26: 101-126.
  • 10 Freund Y, Schapire R E. A short introduction to boosting [J]. Journal of Japanese Society for Artificial Intelligence, 1999, 14(5): 771-780.

Secondary References (22)

  • 1 Sang E F T K, Daelemans W, Déjean H, et al. Applying system combination to base noun phrase identification [C] //Proc of COLING 2000. Saarbrücken, Germany: Morgan Kaufmann, 2000: 857-863.
  • 2 Zhou Ming. Corpus-based Chinese maximum noun phrase extraction [C] //Computer Linguistic Development and Application (in Chinese). Beijing: Tsinghua University Press, 1995: 50-55.
  • 3 Church K W. A stochastic parts program and noun phrase parser for unrestricted text [C] //Proc of the 2nd Conf on Applied Natural Language Processing. Austin, TX, USA: Kluwer Academic Publishers, 1988: 136-143.
  • 4 Abney S P. Parsing by chunks [C] //Berwick R C, Abney S P, eds. Principle-Based Parsing: Computation and Psycholinguistics. Boston, USA: Kluwer Academic Publishers, 1991: 257-278.
  • 5 Ramshaw L A, Marcus M P. Text chunking using transformation-based learning [C] //Proc of the 3rd Workshop on Very Large Corpora. Kluwer Academic Publishers, 1995: 82-94.
  • 6 Ratnaparkhi A. Learning to parse natural language with maximum entropy models [J]. Machine Learning, 1999, 34(1/2/3): 151-176.
  • 7 Fan Xiao. Static phrase and dynamic phrase [C] //Grammar Concept from Three Sides (in Chinese). Beijing: Beijing Language and Culture University Press, 1996.
  • 8 Koeling R. Chunking with maximum entropy models [C] //Proc of CoNLL 2000. Lisbon, Portugal: Association for Computational Linguistics, 2000.
  • 9 Berger A L, Della Pietra S A, Della Pietra V J. A maximum entropy approach to natural language processing [J]. Computational Linguistics, 1996, 22(1): 39-71.
  • 10 Berger A L. The improved iterative scaling algorithm: A gentle introduction [R]. School of Computer Science, Carnegie Mellon University, 1997.

Co-citing Literature (61)

Co-cited Literature (39)

  • 1 Suzuki T, Sugiyama M, Kanamori T, et al. Mutual information estimation reveals global associations between stimuli and biological processes [J]. BMC Bioinformatics, 2009, 10(1): S52.
  • 2 Suzuki T, Sugiyama M. Sufficient dimension reduction via squared-loss mutual information estimation [C] //Proc of the Int Conf on Artificial Intelligence and Statistics. [s.l.]: [s.n.], 2010: 804-811.
  • 3 Sugiyama M, Suzuki T, Nakajima S, et al. Direct importance estimation for covariate shift adaptation [J]. Annals of the Institute of Statistical Mathematics, 2008, 60(4): 699-746.
  • 4 Yamada M, Sugiyama M. Direct importance estimation with Gaussian mixture models [J]. IEICE Trans on Information and Systems, 2009, E92-D(10): 2159-2162.
  • 5 Tipping M E, Bishop C M. Mixtures of probabilistic principal component analyzers [J]. Neural Computation, 1999, 11(2): 443-482.
  • 6 Bishop C M. Pattern recognition and machine learning [M]. New York, USA: Springer-Verlag, 2006.
  • 7 Meng Jiana. Research on the application of transfer learning in text classification [D]. Dalian: Dalian University of Technology, 2011 (in Chinese).
  • 8 Kamishima T, Hamasaki M, Akaho S. TrBagg: A simple transfer learning method and its application to personalization in collaborative tagging [C] //Proc of the IEEE Int Conf on Data Mining. USA: IEEE, 2009: 219-228.
  • 9 Yang P, Gao W. Information-theoretic multi-view domain adaptation: A theoretical and empirical study [J]. Journal of Artificial Intelligence Research, 2014, 49: 501-525.
  • 10 Xia R, Hu X, Lu J, et al. Instance selection and instance weighting for cross-domain sentiment classification via PU learning [C] //Proc of the 23rd Int Joint Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2013: 2176-2182.

Citing Literature (4)

Secondary Citing Literature (7)
