Junk-neuron-deletion strategy for hyperparameter optimization of neural networks
Abstract As the problems tackled by deep learning grow more complex, the size of neural networks, including the number of layers, neurons, and connections between neurons, increases rapidly, and the number of parameters expands sharply. Optimizing hyperparameters to improve the prediction performance of neural networks has therefore become an important task. Methods in the literature for finding optimal parameters, such as sensitivity pruning and grid search, are algorithmically complicated and computationally expensive. This paper proposes a hyperparameter optimization strategy called junk neuron deletion. A neuron whose mean weight in the weight matrix is small contributes negligibly to the prediction and is defined as a junk neuron. The strategy deletes these junk neurons to obtain a simplified network structure, which effectively shortens computation time while improving prediction accuracy and model generalization capability. An LSTM model is trained on time series data generated by the Logistic, Henon and Rossler dynamical systems, and a relatively optimal parameter combination is obtained by grid search with a certain step length. Under this parameter combination, the part of the weight matrix that can influence the model output is extracted, and the neurons with smaller mean weights are eliminated using different thresholds. It is found that, with a mean weight value of 0.1 as the threshold, identifying and deleting junk neurons significantly improves prediction efficiency. As the threshold increases, the accuracy gradually falls back to the initial level, but at the same prediction performance more operating cost is saved. Reducing the network further leads to prediction ability below the initial level because of underfitting. With this strategy, the prediction performance of the LSTM model for several typical chaotic dynamical systems is improved significantly.
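The deletion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the column-per-neuron layout, the use of the mean of absolute weights, and the names `mean_abs_weight_per_neuron` and `delete_junk_neurons` are all assumptions made for illustration. Only the threshold value 0.1 comes from the abstract.

```python
from typing import List, Tuple

# Assumed layout: rows are inputs, columns are neurons.
Matrix = List[List[float]]


def mean_abs_weight_per_neuron(weights: Matrix) -> List[float]:
    """Mean absolute weight of each column (one column per neuron).

    Using absolute values is an assumption; the abstract only says "mean weight".
    """
    n_rows = len(weights)
    n_cols = len(weights[0])
    return [
        sum(abs(weights[r][c]) for r in range(n_rows)) / n_rows
        for c in range(n_cols)
    ]


def delete_junk_neurons(
    weights: Matrix, threshold: float = 0.1
) -> Tuple[Matrix, List[int]]:
    """Drop columns whose mean absolute weight is below `threshold`.

    Returns the pruned matrix and the indices of the neurons that were kept.
    """
    means = mean_abs_weight_per_neuron(weights)
    kept = [c for c, m in enumerate(means) if m >= threshold]
    pruned = [[row[c] for c in kept] for row in weights]
    return pruned, kept


# Toy 3x4 weight matrix (made-up values): neurons 1 and 3 have small weights.
W = [
    [0.50, 0.02, -0.40, 0.05],
    [-0.30, -0.01, 0.60, 0.08],
    [0.20, 0.03, -0.10, -0.02],
]
pruned_W, kept_idx = delete_junk_neurons(W, threshold=0.1)
print(kept_idx)  # prints [0, 2]
```

In a real LSTM the weight tensors are split across gates, so identifying which entries belong to one neuron depends on the framework's layout; the sketch sidesteps that by treating a single dense matrix.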
Authors Huang Ying (黄颖); Gu Chang-Gui (顾长贵); Yang Hui-Jie (杨会杰) (Business School, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source Acta Physica Sinica (《物理学报》), indexed in SCIE, EI, CAS, CSCD and the PKU Core list; 2022, No. 16, pp. 77-85 (9 pages)
Funding Supported by the National Natural Science Foundation of China (Grant Nos. 11875042, 11505114).
Keywords LSTM; chaotic time series prediction; hyperparameter optimization; junk neuron deletion strategy