Convolutional neural network training strategy using knowledge transfer

Cited by: 4
Abstract: To overcome the overfitting and gradient-vanishing problems of deep convolutional neural networks trained on limited labeled samples, a strategy is proposed for transferring knowledge from a source model to train a deep target model. The transferred knowledge comprises the class distribution of the samples and the low-level features of the source model. The class distribution provides inter-class correlation information about the samples, which extends the supervised information of the training set and alleviates the problem of inadequate samples. The low-level features capture local characteristics of the samples, are general across related transfer tasks, and can help the target model escape local-minimum regions. These two parts of knowledge are used to pre-train the target model so that it converges to a better position, after which it is fine-tuned with real labeled samples. Experimental results show that the proposed method both strengthens the model's resistance to overfitting and improves prediction accuracy.
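The abstract's two transfer signals can be sketched in a short NumPy example: a temperature-softened class distribution from the source model supplies extra supervision (a distillation-style cross-entropy term), while the source model's low-level convolution filters are copied into the target model before pre-training. All names, shapes, and the temperature value below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, T=1.0):
    # Temperature-scaled softmax; T > 1 softens the class distribution,
    # exposing inter-class correlation information
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def transfer_loss(target_logits, source_logits, T=2.0):
    # Cross-entropy between the source model's soft class distribution
    # and the target model's prediction: the extra supervision signal
    # used during pre-training (hypothetical form of the paper's loss)
    p_src = softmax(source_logits, T)
    p_tgt = softmax(target_logits, T)
    return -np.mean(np.sum(p_src * np.log(p_tgt + 1e-12), axis=-1))

# Low-level feature transfer: copy the source model's early convolution
# filters into the target model before pre-training (names illustrative).
source_convs = {"conv1": rng.standard_normal((16, 3, 3, 3)),
                "conv2": rng.standard_normal((32, 16, 3, 3))}
target_convs = {name: w.copy() for name, w in source_convs.items()}

# One pre-training evaluation: score the target's predictions on a batch
# of 4 samples over 10 classes against the source model's soft labels.
loss = transfer_loss(rng.standard_normal((4, 10)),
                     rng.standard_normal((4, 10)))
print(round(loss, 4))
```

After this pre-training phase drives the target model to a good region of the loss surface, the abstract's final step (fine-tuning on the real labeled samples) would switch the loss to ordinary cross-entropy against hard labels.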
Authors: 罗可, 周安众, 罗潇 (LUO Ke; ZHOU An-zhong; LUO Xiao, College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China)
Source: Control and Decision (《控制与决策》), EI / CSCD / Peking University Core journal, 2019, No. 3, pp. 511-518 (8 pages)
Funding: National Natural Science Foundation of China (11671125, 71371065, 51707013)
Keywords: convolutional neural network; knowledge transfer; overfitting; gradient vanishing; pre-training; fine-tuning
