Abstract: Deep Neural Networks (DNNs) have become the tool of choice for machine learning practitioners today. One important aspect of designing a neural network is the choice of the activation function to be used at the neurons of the different layers. In this work, we introduce a four-output activation function called the Reflected Rectified Linear Unit (RReLU) activation, which considers both a feature and its negation during computation. Our activation function is "sparse", in that only two of the four possible outputs are active at a given time. We test our activation function on the standard MNIST and CIFAR-10 datasets, which are classification problems, as well as on a novel Computational Fluid Dynamics (CFD) dataset, which is posed as a regression problem. On the baseline network for the MNIST dataset, which has two hidden layers, our activation function improves the validation accuracy from 0.09 to 0.97 compared to the well-known ReLU activation. For the CIFAR-10 dataset, we use a deep baseline network that achieves 0.78 validation accuracy after 20 epochs but overfits the data. Using the RReLU activation, we achieve the same accuracy without overfitting. For the CFD dataset, we show that the RReLU activation can reduce the number of epochs from 100 (using ReLU) to 10 while obtaining the same level of performance.
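The abstract does not spell out the four outputs. One reading consistent with "a feature and its negation" and with exactly two outputs being active at a time is to emit, per input feature, ReLU of the feature, ReLU of its negation, and the negatives of both. The NumPy sketch below implements that reading; the function name and output ordering are assumptions, not the paper's definition:

```python
import numpy as np

def rrelu(x):
    """One plausible reading of the four-output RReLU (an assumption;
    the paper's exact definition may differ): per feature, emit
    relu(x), relu(-x), -relu(x), -relu(-x). For any x != 0 exactly
    two of the four are nonzero, matching the claimed sparsity."""
    pos = np.maximum(x, 0.0)   # relu(x):  the feature
    neg = np.maximum(-x, 0.0)  # relu(-x): its negation
    return np.concatenate([pos, neg, -pos, -neg], axis=-1)

x = np.array([2.0, -1.5])
print(rrelu(x))  # [ 2.   0.   0.   1.5 -2.  -0.  -0.  -1.5]
```

Under this reading the layer width grows by a factor of four, but for each input exactly two of the four outputs carry a nonzero signal.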
Abstract: The rectified linear unit (ReLU) is a widely used activation function in deep convolutional neural networks, but it has two drawbacks: when the input is negative, ReLU outputs zero, causing the zero-gradient problem; and when the input is positive, ReLU passes the input through unchanged, so the mean of its output is always greater than zero, causing a bias-shift phenomenon. Both effects limit the learning speed and learning performance of deep convolutional neural networks. To address the zero-gradient problem and the bias shift of ReLU, we improve it based on the principle that activation functions whose output mean is close to zero improve the learning performance of neural networks, and propose the SLU (softplus linear unit) function. First, the negative input region is processed with softplus, so that SLU's output is negative for negative inputs, bringing the output mean closer to zero and alleviating the bias shift. Second, to keep gradients smooth, the parameters of SLU are constrained and the parameter of the positive part is fixed. Finally, the parameters of the negative part are adjusted to match the treatment of the positive part, ensuring that the activation function is continuous and differentiable at zero so that information can propagate in both directions. A deep autoencoder model is designed for unsupervised learning on the MNIST dataset, and a network-in-network convolutional neural network model is designed for supervised learning on CIFAR-10. Experimental results show that, compared with ReLU and its related improved units, neural network models based on the SLU function have better feature-learning ability and higher learning accuracy.
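The stated constraints (softplus on the negative branch, the positive branch fixed to the identity, continuity and differentiability at zero) pin down one natural parameterization: f(x) = x for x ≥ 0 and f(x) = a(softplus(x) − b) for x < 0, where f(0) = 0 forces b = ln 2 and f′(0) = 1 forces a = 2. A minimal NumPy sketch of that parameterization follows; the specific parameter values are derived from the constraints above and may differ from the paper's:

```python
import numpy as np

def slu(x):
    """Softplus linear unit under the abstract's constraints (assumed
    parameterization): identity for x >= 0; for x < 0, the scaled and
    shifted softplus 2*(log(1 + e^x) - log 2), which is negative for
    negative inputs (pulling the output mean toward zero) and meets
    the positive branch with value 0 and slope 1 at x = 0."""
    a, b = 2.0, np.log(2.0)
    xn = np.minimum(x, 0.0)  # evaluate softplus only on the branch that uses it
    return np.where(x >= 0.0, x, a * (np.log1p(np.exp(xn)) - b))
```

Checking the junction: 2(ln(1 + e⁰) − ln 2) = 0 and the derivative 2e⁰/(1 + e⁰) = 1, so the function is continuously differentiable at zero, as the abstract requires.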
Abstract: For neural networks (NNs) with rectified linear unit (ReLU) or binary activation functions, we show that their training can be accomplished in a reduced parameter space. Specifically, the weights in each neuron can be trained on the unit sphere, as opposed to the entire space, and the threshold can be trained in a bounded interval, as opposed to the real line. We show that the NNs in the reduced parameter space are mathematically equivalent to the standard NNs with parameters in the whole space. The reduced parameter space facilitates the optimization procedure for network training, as the search space becomes (much) smaller. We demonstrate the improved training performance using numerical examples.
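For the ReLU case, the equivalence follows from positive homogeneity: ReLU(cz) = c·ReLU(z) for c > 0, so a neuron's weight norm can be divided out and absorbed into its outgoing weight, leaving a unit-norm weight vector and a rescaled threshold. A minimal numerical check of this identity, with illustrative variable names:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)

# One hidden neuron with outgoing weight v: y = v * relu(w . x - t).
x = rng.normal(size=5)
w = rng.normal(size=5)   # unconstrained weight vector
t, v = 0.3, 1.7          # threshold and outgoing weight

c = np.linalg.norm(w)    # positive scale to divide out
y_full = v * relu(w @ x - t)
# Equivalent neuron: unit-norm weights, rescaled threshold, and the
# scale c absorbed into the outgoing weight (positive homogeneity).
y_reduced = (v * c) * relu((w / c) @ x - t / c)

assert np.isclose(y_full, y_reduced)
```

The bounded threshold interval is the companion observation: with unit-norm weights and inputs from a bounded domain, w·x is confined to an interval, and any threshold outside it makes the neuron identically zero or purely affine, so thresholds can be restricted to that interval without losing expressiveness.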
Abstract: Predicting the rolling force of a temper mill is important for the optimal control of the rolling process. To address the low accuracy of temper mill rolling-force prediction, a neural network model using the ReLU (Rectified Linear Unit) activation function is proposed to predict the rolling force. After principal component analysis of the data, the main factors influencing the rolling force are identified and used as the input layer of the neural network, with the temper mill rolling force as the output layer. Experiments implemented in Python screen the hidden-layer parameters and training algorithms one variable at a time, yielding the neural network model with the highest rolling-force prediction accuracy. Experimental results show that, by tuning the number of hidden layers, the number of neurons, the training algorithm, and the regularization method, the model keeps the prediction error within 10%, and that the method can accurately predict the temper mill rolling force under different input parameters.
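The abstract names the pipeline, principal component analysis to extract the dominant input factors followed by a ReLU neural network regressing the rolling force, but not its configuration. The sketch below uses scikit-learn; the synthetic data, retained-variance threshold, layer sizes, and regularization strength are all placeholders, not the paper's settings:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: process measurements, y: measured rolling force.
# Synthetic stand-in data; the paper's mill dataset is not available here.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=500)

# PCA keeps the dominant factors; the ReLU MLP regresses the rolling force.
# Component count, layer sizes, and alpha are assumptions to be tuned,
# mirroring the paper's one-variable-at-a-time screening.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),          # retain 95% of the variance
    MLPRegressor(hidden_layer_sizes=(32, 32), activation="relu",
                 alpha=1e-3, max_iter=2000, random_state=0),
)
model.fit(X, y)
print("training R^2:", model.score(X, y))
```

In the paper's setup, the screened hyperparameters (hidden-layer count, neuron count, training algorithm, regularization) would each be varied in turn while the rest are held fixed, keeping the configuration with the lowest prediction error.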