Abstract
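As the problems tackled by deep learning grow more complex, the number of layers, neurons, and inter-neuron connections in neural networks keeps increasing and the parameter scale expands sharply, so optimizing hyperparameters to improve the prediction performance of neural networks has become an important task. Methods in the literature for finding optimal parameters, such as sensitivity pruning and grid search, are algorithmically complex and computationally heavy. This paper proposes a hyperparameter optimization strategy called "junk neuron deletion". Neurons with a small mean weight in the weight matrix contribute negligibly to the prediction and are called junk neurons. The strategy deletes these junk neurons to obtain a streamlined network structure, which effectively shortens computation time while improving prediction accuracy and model generalization. With this strategy, the prediction performance of long short-term memory (LSTM) network models on several typical chaotic dynamical systems is significantly improved.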
With the growing complexity of real-world problems, the size of deep neural networks, including the number of layers, neurons, and connections, is increasing explosively. Optimizing hyperparameters to improve the prediction performance of neural networks has become an important task. In the literature, methods for finding optimal parameters, such as sensitivity pruning and grid search, are complicated and computationally expensive. In this paper, a hyperparameter optimization strategy called junk neuron deletion is proposed. A neuron with a small mean weight in the weight matrix contributes negligibly to the prediction and is defined as a junk neuron. The strategy obtains a simplified network structure by deleting the junk neurons, which effectively shortens the computation time and improves the prediction accuracy and the generalization capability of the model. An LSTM model is trained on time series generated by the Logistic, Henon and Rossler dynamical systems, and a relatively optimal parameter combination is obtained by grid search with a certain step length. Under this parameter combination, the part of the weight matrix that can influence the model output is extracted, and the neurons with smaller mean weights are eliminated using different thresholds. It is found that, with a mean weight of 0.1 as the threshold, identifying and deleting junk neurons significantly improves the prediction efficiency. As the threshold is increased, the accuracy gradually falls back to the initial level, but more operating cost is saved for the same prediction performance. Reducing the network further causes the prediction ability to drop below the initial level because of underfitting. Using this strategy, the prediction performance of the LSTM model for several typical chaotic dynamical systems is improved significantly.
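The deletion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which part of the LSTM weight matrix is extracted or whether the mean is taken over signed or absolute weights, so this sketch assumes a plain dense hidden layer and uses the mean absolute outgoing weight of each neuron. The function names `junk_neuron_mask` and `delete_junk_neurons` and the matrix layout are hypothetical.

```python
import numpy as np


def junk_neuron_mask(w_out, threshold=0.1):
    """Return a boolean mask that is True for the neurons to keep.

    w_out has shape (n_hidden, n_outputs); row i holds the outgoing
    weights of hidden neuron i (an assumed layout, not from the paper).
    """
    w_out = np.asarray(w_out, dtype=float)
    # Assumption: "mean weight" is read as the mean of absolute values.
    mean_weight = np.abs(w_out).mean(axis=1)
    return mean_weight >= threshold


def delete_junk_neurons(w_in, w_out, threshold=0.1):
    """Remove junk neurons from a hidden layer of a small dense network.

    w_in has shape (n_inputs, n_hidden) and w_out has shape
    (n_hidden, n_outputs). A deleted neuron loses both its incoming
    column in w_in and its outgoing row in w_out.
    """
    keep = junk_neuron_mask(w_out, threshold)
    return w_in[:, keep], w_out[keep, :], keep


# Toy example with four hidden neurons; values are made up for illustration.
rng = np.random.default_rng(0)
w_in = rng.normal(size=(3, 4))
w_out = np.array([
    [0.50, -0.40],   # mean |w| = 0.45  -> kept
    [0.02, -0.01],   # mean |w| = 0.015 -> junk
    [0.30, 0.20],    # mean |w| = 0.25  -> kept
    [-0.05, 0.03],   # mean |w| = 0.04  -> junk
])
w_in_small, w_out_small, keep = delete_junk_neurons(w_in, w_out, threshold=0.1)
```

With the threshold of 0.1 mentioned in the abstract, two of the four toy neurons are removed and both weight matrices shrink accordingly. Raising `threshold` removes more neurons, which is the trade-off between cost and accuracy that the abstract describes.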
Authors
黄颖
顾长贵
杨会杰
Huang Ying; Gu Chang-Gui; Yang Hui-Jie (Business School, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source
《物理学报》
SCIE
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2022, Issue 16, pp. 77-85 (9 pages in total)
Acta Physica Sinica
Funding
Project supported by the National Natural Science Foundation of China (Grant Nos. 11875042, 11505114).