Abstract
Deep neural network models perform very well in fields such as image recognition and speech recognition, but high-performance models place greater demands on computing resources and are therefore difficult to deploy on edge devices. To address this, a differential deep ensemble learning method based on knowledge distillation is proposed. First, the member models are trained by knowledge distillation; then the member models are ensembled using cosine similarity as a regularization term in the loss function; finally, the trained model is obtained. Experimental results on the MNIST (Modified National Institute of Standards and Technology) and CIFAR10 (Canadian Institute for Advanced Research) datasets show that the method compresses the model while raising classification accuracy to 83.58%, 4% higher than that of the original model without distillation, so generalization performance improves along with compression. The method challenges the assumption that model compression must come at the cost of generalization performance, and it offers a new research direction for model ensembling.
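The abstract names two ingredients, knowledge distillation of each member model and a cosine-similarity regularization term in the ensemble loss, but gives no formulas. The sketch below shows one plausible reading using only the Python standard library. Everything specific in it is an assumption rather than the authors' formulation: the Hinton-style distillation loss, the `temperature`, `alpha`, and `lam` parameters, the choice to measure similarity between member output distributions, and the way the two terms are summed into one loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Assumed form: alpha * CE(hard label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    hard = -math.log(softmax(student_logits)[label])
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(q_teacher, q_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diversity_penalty(member_outputs):
    """Mean pairwise cosine similarity; lower means more diverse members."""
    n = len(member_outputs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(cosine_similarity(member_outputs[i], member_outputs[j])
               for i, j in pairs) / len(pairs)


def ensemble_loss(member_logits, teacher_logits, label, lam=0.1,
                  temperature=4.0, alpha=0.5):
    """Average member distillation loss plus lam times the similarity penalty."""
    kd = sum(distillation_loss(logits, teacher_logits, label, temperature, alpha)
             for logits in member_logits) / len(member_logits)
    probs = [softmax(logits) for logits in member_logits]
    return kd + lam * diversity_penalty(probs)


# Hypothetical logits for one 3-class sample with two student members.
teacher = [3.0, 1.0, 0.2]
members = [[2.5, 0.8, 0.1], [1.9, 1.2, 0.4]]
print(ensemble_loss(members, teacher, label=0))
```

Minimizing the penalty pushes the members' outputs apart, which is the usual motivation for a diversity term in ensembles; the weight `lam` would have to be tuned so that this does not come at the cost of each member's own accuracy.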
Authors
ZHANG Ximin (张锡敏)
QIAN Yaguan (钱亚冠)
MA Danfeng (马丹峰)
GUO Yankai (郭艳凯)
KANG Ming (康明)
School of Sciences, Zhejiang University of Science and Technology, Hangzhou 310023, Zhejiang, China
Source
Journal of Zhejiang University of Science and Technology (《浙江科技学院学报》)
CAS
2021, Issue 3, pp. 220-226 (7 pages)
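Funding
National Natural Science Foundation of China (61902082)
Natural Science Foundation of Zhejiang Province (LY17F020011)
Zhejiang Provincial Public Welfare Technology Application Research Project (LGG19F030001)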
Keywords
knowledge distillation
differential ensemble
deep neural networks