

Double-Generators Network for Data-Free Knowledge Distillation
Abstract  Knowledge distillation (KD) lets a teacher network guide the training of a student network by maximizing the similarity of their output distributions, and has become a key technique for compressing large-scale deep networks and deploying them close to the end user. However, growing privacy-protection awareness and data-transmission constraints increasingly make the original training data unavailable, so preserving the accuracy of the compressed network in a data-free setting has become an important research direction. The data-free learning of student networks (DAFL) model builds a generator on the teacher side to produce a pseudo dataset whose distribution approximates that of the pretrained network, and then trains the student network on these pseudo samples by knowledge distillation. Two problems remain in how this generator is built and optimized: 1) the framework fully trusts the teacher network's judgments on pseudo samples that have no real labels, and because the teacher and student networks pursue different optimization targets, the student network struggles to obtain accurate and consistent optimization signals; 2) the generator relies solely on losses derived from the teacher network, which reduces the diversity of the generated features and harms the student network's generalization. To address these two problems, we propose DG-DAFL (double generators-DAFL), a double-generator architecture that builds a teacher-side and a student-side generator and optimizes them simultaneously, so that the generation task and the optimization target are consistent and the student network's discriminative performance improves. We further add a sample-distribution difference loss between the two generators, which uses the teacher network's latent distribution prior to optimize the generators, maintaining the student network's recognition accuracy while improving its generalization. Experiments on three popular datasets show that the method achieves more effective and more robust knowledge distillation in the data-free setting. The code and models of DG-DAFL are available at https://github.com/LNNU-computer-research-526/DG-DAFL.git.
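For readers who want to see how the pieces described in the abstract fit together, the following is a minimal PyTorch-style sketch of one training step with a teacher-side generator, a student-side generator, and a distillation step for the student. The network and optimizer objects, the loss weight lam, the simplified one-hot generator objective, and the particular form chosen for the sample-distribution difference term are assumptions made for illustration; the authors' released code at the repository above is the authoritative implementation.

# A minimal sketch of one DG-DAFL-style training step (illustrative, not the released code).
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    # Standard KD term: KL divergence between temperature-softened output distributions.
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T

def dg_dafl_step(teacher, student, gen_t, gen_s, opt_s, opt_gt, opt_gs,
                 batch_size=128, z_dim=100, lam=0.1, device="cpu"):
    teacher.eval()
    for p in teacher.parameters():           # the pretrained teacher stays frozen
        p.requires_grad_(False)
    z = torch.randn(batch_size, z_dim, device=device)

    # 1) Teacher-side generator: simplified DAFL-style "one-hot" objective --
    #    the teacher should classify its own pseudo samples confidently.
    x_t = gen_t(z)
    logits_t = teacher(x_t)
    loss_gt = F.cross_entropy(logits_t, logits_t.argmax(dim=1))
    opt_gt.zero_grad()
    loss_gt.backward()
    opt_gt.step()

    # 2) Student-side generator: optimized against the student's own task so the
    #    generation target matches the network it serves. The difference term
    #    (assumed form) discourages the two generators from collapsing onto the
    #    same region of the teacher's output distribution.
    x_s = gen_s(z)
    with torch.no_grad():
        pseudo_labels = teacher(x_s).argmax(dim=1)
        p_ref = F.softmax(teacher(gen_t(z)), dim=1)
    loss_gs = F.cross_entropy(student(x_s), pseudo_labels)
    loss_diff = -F.kl_div(F.log_softmax(teacher(x_s), dim=1), p_ref,
                          reduction="batchmean")
    opt_gs.zero_grad()
    (loss_gs + lam * loss_diff).backward()
    opt_gs.step()

    # 3) Student: distilled on pseudo samples drawn from both generators.
    with torch.no_grad():
        x = torch.cat([gen_t(z), gen_s(z)], dim=0)
        t_logits = teacher(x)
    loss_kd = kd_loss(student(x), t_logits)
    opt_s.zero_grad()
    loss_kd.backward()
    opt_s.step()
    return loss_gt.item(), loss_kd.item()

In this sketch the student-side generator is trained against the student's own predictions, mirroring the paper's idea of aligning the generation task with the network it feeds, while the negative KL term is only one plausible way to realize the sample-distribution difference loss described above.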
Authors  Zhang Jing (张晶), Ju Jialiang (鞠佳良), Ren Yonggong (任永功) (School of Computer Science and Artificial Intelligence, Liaoning Normal University, Dalian, Liaoning 116081)
Source  Journal of Computer Research and Development (《计算机研究与发展》; indexed in EI, CSCD, and the Peking University Core list), 2023, No. 7, pp. 1615-1627 (13 pages)
Funding  National Natural Science Foundation of China (61902165, 61976109); Dalian Science and Technology Innovation Fund (2018J12GX047); Humanities and Social Sciences Research Planning Fund of the Ministry of Education (21YJC880104).
Keywords  deep neural network; knowledge distillation; data-free knowledge distillation; generative adversarial network; generator