
Pruning Approach to Neural Networks Based on Zero-norm Regularization
Abstract: This paper proposes an effective pruning method for neural networks. The method introduces a zero-norm regularization term into the network training model to promote weight sparsity, and compresses the model by deleting the weights whose values are zero. For the proposed zero-norm regularized training model, an equivalent locally Lipschitz surrogate is obtained by establishing a global exact penalty for its equivalent MPEC (mathematical program with equilibrium constraints) formulation; the network is then trained and pruned by solving this Lipschitz surrogate with the alternating direction method of multipliers (ADMM). Tests on the MLP and LeNet-5 models attain 97.43% and 99.50% sparsity at error rates of 2.2% and 1%, respectively, a strong pruning result.

Extended abstract: Deep neural networks (DNNs) have become ubiquitous in daily life, from autonomous driving to smart homes, and deploying DNN models on mobile devices and embedded systems has become an inevitable trend. Parameter redundancy has long been the main obstacle to neural network inference and to deployment on mobile systems. In recent years, academia and industry have proposed many model-compression methods, such as quantization, knowledge distillation, and network pruning. Network pruning, an important means of model compression, reduces the number of parameters by removing some neural connections, effectively mitigating the high computational cost and large memory footprint caused by weight redundancy. The method in this article further extends the network-pruning model and its solution algorithm.

In this work, we propose an effective pruning method for neural networks to address the high computational cost and considerable memory bandwidth caused by the huge complexity and parameter redundancy of neural network models. The method improves the sparsity of the model weights by introducing a zero-norm regularization term into the training model, and compresses the model by deleting the zero weights. For the proposed zero-norm regularized model, we obtain an equivalent locally Lipschitz surrogate by establishing a global exact penalty for its equivalent MPEC formulation. When the activation function is the sigmoid, the loss of the resulting optimization model is a combination of a smooth term and a nonsmooth term; the smooth part can be handled by existing computational-graph frameworks, while the nonsmooth part admits an exact proximal expression. We therefore design a proximal alternating direction method of multipliers (P-ADMM) to solve the smooth-loss model induced by the sigmoid activation. Numerical experiments validate the efficiency of P-ADMM: the tests on the MLP and LeNet-5 networks yield 97.43% and 99.50% sparsity with error rates of 2.2% and 1%, respectively. The results show that our method effectively reduces model complexity and achieves a better sparsity ratio than other pruning methods, while being convenient to implement and easy to extend.

Because the neural network model is highly nonconvex, the convergence of the algorithm slows in later iterations even though the paper combines alternating minimization with a computational-graph framework. One direction for future research is therefore to devise an acceleration strategy that improves the convergence rate, and to investigate whether the nonconvex, nonsmooth model can be solved directly by gradient methods within backpropagation and computational-graph frameworks. Another interesting direction is how to design effective algorithms, with provable convergence guarantees, when the loss function itself is nonsmooth.
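To make the formulation concrete, the following is a minimal reconstruction of the standard zero-norm/MPEC machinery the abstract alludes to; the symbols (a generic training loss \mathcal{L}, penalty parameters \lambda and \rho) are illustrative placeholders, not the paper's notation. The zero-norm admits a variational characterization with a complementarity constraint:

\min_{W} \; \mathcal{L}(W) + \lambda \|W\|_{0},
\qquad
\|w\|_{0} \;=\; \min_{v \in [0,1]^{n}} \Big\{ \sum_{i=1}^{n} (1 - v_{i}) \;:\; \langle v, |w| \rangle = 0 \Big\}.

Replacing the constraint \langle v, |w| \rangle = 0 by the penalty \rho \langle v, |w| \rangle and minimizing over each v_i \in [0,1] gives, for sufficiently large \rho, the equivalent locally Lipschitz surrogate

\min_{W} \; \mathcal{L}(W) + \lambda \sum_{i} \min\{1, \; \rho\, |W_{i}|\},

i.e. the capped-\ell_1 function, since \min_{v_i \in [0,1]} \big[(1 - v_i) + \rho v_i |w_i|\big] = \min\{1, \rho |w_i|\}.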
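The sketch below illustrates, on a toy least-squares problem so that it runs as-is, the ADMM splitting pattern the abstract describes: alternating a gradient-based update of the smooth loss (where, in a real network, SGD/backprop on the computational graph would go) with an exact proximal update of the nonsmooth regularizer. This is not the paper's code; the names loss_grad, lam, beta, and hard_threshold, and the use of the plain l0 proximal map, are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))         # toy data matrix
w_true = np.zeros(50); w_true[:5] = 1.0    # sparse ground-truth weights
b = A @ w_true + 0.01 * rng.standard_normal(200)

lam, beta = 0.05, 1.0                      # regularization weight, ADMM penalty

def loss_grad(w):
    # gradient of the smooth loss L(w) = 0.5 * ||A w - b||^2
    return A.T @ (A @ w - b)

def hard_threshold(x, tau):
    # exact proximal map of tau * ||.||_0: keep entries with x_i^2 > 2 * tau
    z = x.copy()
    z[x**2 <= 2.0 * tau] = 0.0
    return z

w = np.zeros(50); z = np.zeros(50); u = np.zeros(50)
step = 1.0 / (np.linalg.norm(A, 2) ** 2 + beta)    # safe step size
for it in range(500):
    # w-step: a few gradient steps on L(w) + beta/2 * ||w - z + u||^2
    for _ in range(10):
        w -= step * (loss_grad(w) + beta * (w - z + u))
    # z-step: closed-form proximal update for the zero-norm term
    z = hard_threshold(w + u, lam / beta)
    # dual update
    u += w - z

sparsity = 1.0 - np.count_nonzero(z) / z.size
print(f"sparsity of pruned weights: {sparsity:.2%}")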
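Continuing the sketch above, the pruning step itself ("deleting the zero weights") can be pictured as fixing the zero pattern found by ADMM and fine-tuning only the surviving connections; the masked fine-tuning loop here is again an illustrative assumption, not the paper's procedure.

mask = (z != 0).astype(float)     # support found by the z-step
w_pruned = w * mask               # delete (zero out) pruned connections
for _ in range(200):              # brief fine-tuning restricted to the support
    w_pruned -= step * (loss_grad(w_pruned) * mask)
print("retained weights:", int(mask.sum()), "of", mask.size)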
Author: LIU Zhi (School of Mathematics, South China University of Technology, Guangzhou 510000, China)
Source: Operations Research and Management Science (《运筹与管理》), CSCD, Peking University Core Journal, 2023, Issue 10, pp. 102-107 (6 pages)
Funding: General Program of the National Natural Science Foundation of China (11971177).
Keywords: network pruning; zero-norm regularization; ADMM