Abstract
To study the mechanism of activation functions in depth, to identify the properties a good activation function should possess, and thereby to improve the generalization ability of convolutional neural network models, this article reviews the development of activation functions and analyzes the properties a good activation function should have. Activation functions can be roughly divided into "S-type" activation functions, "ReLU-type" activation functions, combined activation functions, and other types. In the early stage of deep learning, "S-type" activation functions were widely used. As network models grew deeper, the "vanishing gradient" problem of "S-type" activation functions gradually emerged. The ReLU activation function alleviated this problem, but setting the negative half-axis of ReLU to 0 introduced the "dead neuron" problem. Most subsequently proposed activation functions modify the negative half-axis of ReLU to mitigate "dead neurons". Finally, taking the multilayer perceptron as an example, the article derives the role a good activation function plays in forward and backward propagation and the properties it should possess.
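As a minimal numerical sketch (not code from the paper), the two failure modes the abstract mentions can be made concrete: the sigmoid ("S-type") gradient saturates toward zero for large inputs, which compounds across deep layers, while the ReLU gradient is exactly zero on the negative half-axis, so a neuron stuck there stops learning.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    # Peaks at 0.25 when x = 0 and vanishes for large |x|.
    return s * (1.0 - s)

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Exactly 0 on the negative half-axis: a "dead" neuron
    # receives no gradient and cannot recover.
    return np.where(x > 0, 1.0, 0.0)

# Saturation: the sigmoid gradient at |x| = 10 is ~4.5e-5, so
# stacking many sigmoid layers shrinks gradients multiplicatively.
print(sigmoid_grad(10.0))

# Dead ReLU: zero output and zero gradient for negative pre-activations.
print(relu(-3.0), relu_grad(-3.0))
```

This illustrates why most ReLU variants the article surveys (e.g. those with a small negative-side slope) change only the negative half-axis: restoring a nonzero gradient there addresses dead neurons without reintroducing saturation.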
Authors
张焕
张庆
于纪言
ZHANG Huan; ZHANG Qing; YU Jiyan (National Defense Key Discipline Laboratory of Intelligent Ammunition Technology, School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)
Source
《西华大学学报(自然科学版)》
2021, Issue 4, pp. 1-10 (10 pages)
Journal of Xihua University:Natural Science Edition
Funding
National Defense Science and Technology Advance Research Fund Project (KO01071).
Keywords
deep learning
convolutional neural network
activation function
back propagation
ReLU