
Derivative-free few-shot learning based performance optimization method of pre-trained models with convolution structure (cited by: 1)
Abstract: Deep learning models with convolutional structure generalize poorly in few-shot learning scenarios. To address this, taking AlexNet and ResNet as examples, a performance optimization method for pre-trained models with convolutional structure, based on derivative-free few-shot learning, is proposed. First, the sample data are modulated based on causal intervention to generate sequential data from non-sequential data, and the pre-trained model is pruned in a targeted manner using the co-integration test, from the perspective of the stationarity of the data distribution. Then, based on the Capital Asset Pricing Model (CAPM) and optimal transport theory, forward learning without gradient propagation is carried out on the intermediate outputs of the pre-trained model and a new structure is constructed, generating representation vectors with clear inter-class distinguishability in the distribution space. Finally, the generated effective features are adaptively weighted by a self-attention mechanism and aggregated in the fully connected layer to generate weakly correlated embedding vectors. Experimental results show that the proposed method raises the Top-1 accuracy of the pre-trained AlexNet and ResNet models on 100 classes of images from the ImageNet 2012 dataset from 58.82% and 78.51% to 68.50% and 85.72%, respectively. The proposed method can therefore effectively improve the performance of pre-trained models with convolutional structure using few-shot training data.
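As a rough illustration of what "inter-class distinguishability in the distribution space" can mean, the sketch below computes the empirical 1-Wasserstein distance (listed among the keywords) between one-dimensional samples. It is not the authors' procedure: the function name `wasserstein_1d` and the activation values are hypothetical, and the restriction to equal-size 1-D samples is an assumption made to keep the example short.

```python
def wasserstein_1d(u, v):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples."""
    if len(u) == 0 or len(u) != len(v):
        raise ValueError("samples must be non-empty and of equal size")
    # For equal-size empirical distributions on the real line, the optimal
    # transport plan pairs the i-th smallest value of u with the i-th
    # smallest value of v, so the distance is the mean absolute gap.
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


# Hypothetical activations of a single feature channel for three classes
# (illustrative numbers only, not taken from the paper).
class_a = [0.1, 0.2, 0.3, 0.4]
class_b = [1.1, 1.2, 1.3, 1.4]
class_c = [0.15, 0.25, 0.35, 0.45]

d_ab = wasserstein_1d(class_a, class_b)  # well-separated classes
d_ac = wasserstein_1d(class_a, class_c)  # heavily overlapping classes
```

Under this reading, a feature whose class-conditional distributions lie far apart (a large distance, as for `class_a` versus `class_b`) separates those classes better than one whose distributions nearly coincide.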
Authors: LI Yaming (李亚鸣); XING Kai (邢凯); DENG Hongwu (邓洪武); WANG Zhiyong (王志勇); HU Xuan (胡璇) (School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027, China; Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215123, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2022, Issue 2, pp. 365-374 (10 pages)
Keywords: Capital Asset Pricing Model (CAPM); Wasserstein distance; derivative-free learning; self-attention mechanism; pre-trained model
