Abstract: Deep Neural Networks (DNNs) have become the tool of choice for machine learning practitioners today. One important aspect of designing a neural network is the choice of the activation function to be used at the neurons of the different layers. In this work, we introduce a four-output activation function called the Reflected Rectified Linear Unit (RReLU) activation, which considers both a feature and its negation during computation. Our activation function is "sparse", in that only two of the four possible outputs are active at a given time. We test our activation function on the standard MNIST and CIFAR-10 datasets, which are classification problems, as well as on a novel Computational Fluid Dynamics (CFD) dataset posed as a regression problem. On the baseline network for the MNIST dataset, which has two hidden layers, our activation function improves the validation accuracy from 0.09 to 0.97 compared to the well-known ReLU activation. For the CIFAR-10 dataset, we use a deep baseline network that achieves 0.78 validation accuracy after 20 epochs but overfits the data. Using the RReLU activation, we can achieve the same accuracy without overfitting. For the CFD dataset, we show that the RReLU activation can reduce the number of training epochs from 100 (using ReLU) to 10 while obtaining the same level of performance.
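The abstract does not give the RReLU formula, but one plausible reading of "a feature and its negation" with "only two of the four outputs active" is to emit the positive and negative parts of both x and its reflection -x. The function `rrelu` below is a hypothetical sketch of that reading, not the paper's definition:

```python
import numpy as np

def rrelu(x):
    """Hypothetical sketch of a four-output "reflected" ReLU.

    For each input feature x it emits four channels built from x and its
    reflection -x; for any nonzero x exactly two channels are nonzero,
    matching the "sparse" property described in the abstract.
    NOTE: this is a guessed formulation, not the paper's definition.
    """
    x = np.asarray(x, dtype=float)
    return np.stack([np.maximum(x, 0.0),    # positive part of x
                     np.minimum(x, 0.0),    # negative part of x
                     np.maximum(-x, 0.0),   # positive part of -x
                     np.minimum(-x, 0.0)],  # negative part of -x
                    axis=-1)

out = rrelu([2.0, -3.0])  # shape (2, 4); each row has exactly two nonzero entries
```

Under this reading, the layer quadruples its output width, and each input activates exactly one rectified pair, which is one way to interpret the reported sparsity.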
Abstract: The rectified linear unit (ReLU) is a commonly used activation function in deep convolutional neural networks. However, when the input is negative, ReLU outputs zero, causing the zero-gradient problem; and when the input is positive, ReLU passes the input through unchanged, so the mean of ReLU's output is always greater than zero, causing a bias-shift effect that limits the learning rate and learning performance of deep convolutional neural networks. To address ReLU's zero-gradient problem and bias shift, we improve it based on the principle that an activation function whose mean output is close to zero improves neural network learning performance, and propose the SLU (softplus linear unit) function. First, the negative part of the input is processed with softplus so that the SLU output is negative for negative inputs, bringing the mean output closer to zero and mitigating the bias shift. Second, to keep gradients stable, the SLU parameters are constrained and the parameters of the positive part are fixed. Finally, the parameters of the negative part are adjusted according to the treatment of the positive part, so that the activation function is continuously differentiable at zero and information can propagate in both directions. We design a deep autoencoder model for unsupervised learning on the MNIST dataset, and a network-in-network convolutional neural network model for supervised learning on the CIFAR-10 dataset. Experimental results show that, compared with ReLU and its related improved units, neural network models based on the SLU function have better feature-learning ability and higher learning accuracy.
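The abstract states the constraints on SLU (identity on the positive part, softplus-based and negative on the negative part, continuously differentiable at zero) without giving the parameter values. The sketch below is one parameterization that satisfies all of those constraints; the paper's actual parameters may differ. Taking f(x) = x for x ≥ 0 and f(x) = 2·ln((1 + eˣ)/2) for x < 0 gives f(0) = 0 and f′(0) = 2·sigmoid(0) = 1 on both sides:

```python
import numpy as np

def slu(x):
    """Sketch of a softplus linear unit meeting the abstract's constraints:
    identity for x >= 0, and a scaled/shifted softplus for x < 0 that is
    negative there, with continuity and matching slope at zero.

    With f(x) = 2*ln((1 + e^x)/2) for x < 0:
      f(0)  = 2*ln(1) = 0        (continuous at 0)
      f'(0) = 2*sigmoid(0) = 1   (matches the slope of the identity part)
    This parameterization is illustrative, not necessarily the paper's.
    """
    x = np.asarray(x, dtype=float)
    # Clip to x <= 0 inside exp so the unused branch cannot overflow.
    neg = 2.0 * np.log((1.0 + np.exp(np.minimum(x, 0.0))) / 2.0)
    return np.where(x >= 0.0, x, neg)
```

Note that this negative branch saturates at -2·ln(2) ≈ -1.386 as x → -∞, so negative inputs still carry gradient (no zero-gradient region) while pulling the mean activation toward zero, which is the bias-shift mitigation the abstract describes.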