Abstract
With the spread of big data and the growth of computing power, deep learning has become a popular research area, but its strong performance depends heavily on network structure and parameter settings. Improving model performance while reducing model complexity therefore hinges on model optimization. To describe the optimization problem concisely, this paper takes supervised deep learning as its entry point and surveys the optimization methods for improving fitting ability and generalization ability. First, the basic formulation of optimization is given and its core ideas are explained. Second, from the perspective of fitting ability, the optimization problem is decomposed into three directions, namely convergence, convergence speed, and global quality, and the specific methods and research results in each direction are summarized and analyzed. From the perspective of improving generalization ability, the state of research on regularization methods is organized into two categories: data preprocessing and model-parameter constraints. Building on this theoretical foundation, the development of generative adversarial network (GAN) variant models is used as a main thread to review how the various optimization methods have been applied in that field; the optimization effects are compared and analyzed on the basis of experimental results, and several optimization strategies that work well for GANs are identified. At present, optimization methods are widely used in deep learning models and can markedly improve fitting ability, while regularization alleviates overfitting and thereby improves model robustness. Although optimization for deep learning has been studied extensively, a mature, systematic theory for guiding the use of optimization methods is still lacking, and several optimization problems remain open, including the inability to guarantee a global Lipschitz constraint on gradients, the search for a stable global optimum in GANs, and the lack of rigorous theoretical proof for the interpretability of optimization methods.
Deep learning has developed rapidly in the big data era. However, its capability is still constrained by the design of the network structure and the parameter settings. It is therefore essential to improve the performance of the model while controlling its complexity. Machine learning can be divided into five categories by learning method: 1) supervised learning, 2) unsupervised learning, 3) semi-supervised learning, 4) deep learning, and 5) reinforcement learning, and these techniques are often combined in practice. To improve fitting and generalization ability, we take supervised deep learning as our focus and summarize and analyze its optimization methods. First, the mechanism of optimization is demonstrated and its key elements are illustrated. Then, the optimization problem is decomposed into three directions related to fitting ability: 1) convergence, 2) convergence speed, and 3) global quality. We summarize and analyze the specific methods and research results for each of these three directions. Convergence refers to running the algorithm until it reaches a solution such as a stationary point. The gradient exploding/vanishing problem shows that small changes in a multi-layer network may be amplified, or may decay and vanish, as they propagate through the layers. Convergence speed refers to the ability to help the model converge faster; once convergence is assured, optimization algorithms that accelerate convergence should be considered to further improve the performance of the model. The global-quality problem is to ensure that the model converges to a lower solution (the global minimum). The first two problems are local in nature and the last one is global. The boundary between these three problems is fuzzy; for example, some optimization methods that improve convergence also accelerate the convergence speed of the model. After fitting-oriented optimization, the large number of parameters in a deep learning model must also be considered, since it can cause poor generalization due to overfitting. Regularization can be regarded as an effective method for improving generalization. To this end, current regularization methods are categorized from two aspects: 1) data processing and 2) model-parameter constraints. Data processing refers to processing the data during model training, such as dataset augmentation, noise injection, and adversarial training; these methods can effectively improve the generalization ability of the model. Model-parameter constraints restrict the parameters of the network and can likewise improve generalization. We take the generative adversarial network (GAN), a widely used deep learning model, as the application background and review the development of its variant models. We analyze the application of the relevant optimization methods in the GAN domain from the two aspects of fitting and generalization ability. Taking WGAN with gradient penalty (WGAN-GP) as the base model, we design an experiment on the MNIST-10 dataset to study the applicability of six algorithms (stochastic gradient descent (SGD), momentum SGD, Adagrad, Adadelta, root mean square propagation (RMSProp), and Adam) in the GAN domain. The optimization effects are compared and analyzed based on the experimental results of multiple optimization methods on GAN variants, and several optimization strategies that work well for GANs are further clarified. At present, various optimization methods are widely used in deep learning models. Methods that improve fitting ability improve the performance of the model, and regularized optimization methods help alleviate overfitting and improve the robustness of the model. However, there is still a lack of systematic theories and mechanisms for guidance. In addition, some optimization problems remain to be studied further. A Lipschitz constraint on the global gradient cannot be guaranteed in deep neural networks because of the gap between theory and practice. In the GAN field, a theoretical breakthrough for finding a stable global optimum, that is, the optimal Nash equilibrium, is still lacking. Moreover, some existing optimization methods are empirical, and their interpretability lacks a clear theoretical proof. Optimization methods in deep learning are numerous and complex, and their use should focus on the combined effect of multiple optimizations. Our critical analysis can serve as a reference for selecting optimization methods when designing deep neural networks.
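As background for the WGAN-GP base model used in the experiment, the critic objective from the original WGAN-GP formulation (standard in the literature, not a contribution of this paper) augments the Wasserstein loss with a gradient penalty that softly enforces the Lipschitz constraint:

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\,\tilde{x}, \quad \epsilon \sim U[0, 1]
```

Here $P_r$ and $P_g$ are the real and generated distributions, and $\hat{x}$ is sampled along lines between real and generated points, which is why the global Lipschitz constraint discussed above is only enforced approximately.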
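For reference, the update rules of four of the six optimizers compared in the paper's experiment can be sketched on a toy one-dimensional objective. This is a minimal illustration, not the paper's experimental code; the learning rates, decay coefficients, and the quadratic objective are illustrative choices only.

```python
import math

def grad(x):
    # Gradient of the toy objective f(x) = x^2, whose minimum is x = 0.
    return 2.0 * x

def sgd(x, lr=0.1, steps=100):
    # Plain SGD: step against the raw gradient.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def momentum_sgd(x, lr=0.1, beta=0.9, steps=100):
    # Momentum SGD: accumulate a velocity to damp oscillation and speed convergence.
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(x)
        x -= lr * v
    return x

def rmsprop(x, lr=0.1, rho=0.9, eps=1e-8, steps=100):
    # RMSProp: divide by a running RMS of gradients to adapt the step size.
    s = 0.0
    for _ in range(steps):
        g = grad(x)
        s = rho * s + (1 - rho) * g * g
        x -= lr * g / (math.sqrt(s) + eps)
    return x

def adam(x, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=100):
    # Adam: bias-corrected first and second moment estimates of the gradient.
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x
```

On this convex toy problem all four methods approach the minimum; their relative merits in the paper's GAN setting depend on the non-convex loss landscape and are assessed empirically there.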
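The model-parameter-constraint style of regularization mentioned above can be illustrated with L2 weight decay added to a plain SGD update. This is a hypothetical one-parameter sketch, not taken from the paper; the objective, penalty strength, and learning rate are illustrative.

```python
def sgd_with_weight_decay(w, lr=0.05, wd=1.0, steps=500):
    """Minimize (w - 3)^2 plus an L2 penalty (wd / 2) * w^2.

    Weight decay adds wd * w to the task gradient each step, pulling the
    solution from the unregularized optimum w = 3 toward zero. Setting the
    combined gradient 2 * (w - 3) + wd * w to zero gives w = 6 / (2 + wd),
    i.e. w = 2 for wd = 1.
    """
    for _ in range(steps):
        g = 2.0 * (w - 3.0) + wd * w  # task gradient + L2 penalty gradient
        w -= lr * g
    return w
```

The shrinkage toward zero is what limits parameter magnitude and, in a full network, helps control overfitting.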
Authors
Jiang Lingyi; Zheng Yifeng; Chen Che; Li Guohe; Zhang Wenjie
(College of Computer Science, Minnan Normal University, Zhangzhou 363000, China; Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Zhangzhou 363000, China; College of Information Science and Engineering, China University of Petroleum, Beijing 102249, China)
Source
《中国图象图形学报》
CSCD
PKU Core (Peking University core journal list)
2023, No. 4, pp. 963-983 (21 pages)
Journal of Image and Graphics
Funding
National Natural Science Foundation of China (62141602)
Natural Science Foundation of Fujian Province (2021J011004, 2021J011002)
Karamay Science and Technology Development Program (2020CGZH0009)
Industry-University-Research Innovation Program of the Ministry of Education (2021LDA09003).
Keywords
machine learning
deep learning
deep learning optimization
regularization
generative adversarial network(GAN)