Abstract
To address the inefficient control of the temperature hyperparameter and the high time cost in teacher-student networks, an interpretable model is proposed that is trained with the assistance of a small teacher network using an adaptive temperature. Building on the original teacher-student structure, the paper first shows that the temperature hyperparameter is related only to the convergence speed of student-model training. It then adds a small teacher model to the structure, which reduces the training time of the explanation model. In image-classification validation experiments on the CIFAR-100 dataset, the explanation model's average top-1 accuracy improved by 2.45% over the original method, and processing time was reduced by 26.33%. The proposed method provides a global approximation of the model being explained and is a post-hoc interpretability method with a short processing time.
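For readers unfamiliar with the temperature hyperparameter mentioned in the abstract, the following is a minimal sketch of the standard temperature-scaled softmax and soft-target distillation loss used in teacher-student training. It is not the paper's adaptive-temperature scheme or its small-teacher structure, neither of which is specified in this record; the function names and the T^2 scaling convention are assumptions chosen for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Standard temperature-scaled softmax: p_i = exp(z_i/T) / sum_j exp(z_j/T).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    The T^2 factor is a common convention in distillation, assumed here.
    """
    p = softmax_with_temperature(teacher_logits, temperature)
    q = softmax_with_temperature(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]  # made-up logits for a 3-class example
    student = [2.5, 1.5, 0.2]
    for t in (1.0, 4.0):
        probs = softmax_with_temperature(teacher, t)
        loss = distillation_loss(student, teacher, t)
        print(t, [round(x, 3) for x in probs], round(loss, 4))
```

A higher temperature flattens the teacher's output distribution, so the student sees more of the relative similarity between classes; how the temperature is adapted during training is the paper's contribution and is not reproduced here.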
Authors
BEN Kerong (贲可荣); WANG Tianyu (王天雨); ZHANG Xian (张献)
College of Electronic Engineering, Navy University of Engineering, Wuhan 430033, China
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, PKU Core (北大核心)
2022, Issue 2, pp. 124-129 (6 pages)
Funding
Military 13th Five-Year Plan national defense pre-research project.
Keywords
interpretability
ex-post interpretable method
model distillation
adaptive temperature
small teacher network