Funding: This work was supported in part by the China Scholarship Council (201706020062), the China 973 Program (2015CB358700), the National Natural Science Foundation of China (Grant Nos. 61772059 and 61421003), the Beijing Advanced Innovation Center for Big Data and Brain Computing (BDBC), and the State Key Laboratory of Software Development Environment (SKLSDE-2018ZX-17).
Abstract: An iterative procedure introduced in MacKay's evidence framework is often used for estimating the hyperparameter in empirical Bayes. Together with a particular form of prior, the estimation of the hyperparameter reduces to an automatic relevance determination (ARD) model, which provides a soft way of pruning model parameters. Despite its effectiveness, this estimation procedure has remained primarily a heuristic to date, and its application to deep neural networks has not yet been explored. This paper formally investigates the mathematical nature of the procedure and justifies it as a well-principled algorithmic framework, which we call the MacKay algorithm. As an application, we demonstrate its use on deep neural networks, which typically have complicated structures with millions of parameters and can be pruned to reduce memory requirements and boost computational efficiency. In experiments, we apply the MacKay algorithm to prune the parameters of simple networks such as LeNet, deep convolutional VGG-like networks, and residual networks on large image classification tasks. Experimental results show that the algorithm can compress neural networks to a high level of sparsity with little loss of prediction accuracy, which is comparable with the state of the art.
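The abstract does not spell out the update rule, but for context, the classical MacKay/ARD re-estimation step it builds on is well known from the evidence framework for Bayesian linear regression. The sketch below illustrates that classical step only; the synthetic data, variable names, and thresholds are illustrative assumptions, not the paper's method or its deep-network extension.

```python
import numpy as np

# A minimal sketch of MacKay's evidence-framework re-estimation with
# per-weight (ARD) prior precisions, on Bayesian linear regression.
# The paper's "MacKay algorithm" generalizes this idea; everything
# below (data, beta, caps, thresholds) is an illustrative assumption.

rng = np.random.default_rng(0)
N, D = 100, 10
X = rng.normal(size=(N, D))
true_w = np.zeros(D)
true_w[:3] = [2.0, -3.0, 1.5]       # only 3 of 10 weights are relevant
y = X @ true_w + 0.1 * rng.normal(size=N)

alpha = np.ones(D)                  # per-weight prior precisions (ARD)
beta = 100.0                        # noise precision, assumed known here

for _ in range(50):
    # Gaussian posterior over weights under current hyperparameters
    A = np.diag(alpha) + beta * X.T @ X        # posterior precision
    Sigma = np.linalg.inv(A)                   # posterior covariance
    m = beta * Sigma @ X.T @ y                 # posterior mean

    # MacKay's fixed-point update: gamma_i in [0, 1] measures how
    # well-determined weight i is by the data; for irrelevant weights
    # alpha_i grows without bound, which softly prunes them.
    gamma = np.clip(1.0 - alpha * np.diag(Sigma), 1e-12, 1.0)
    alpha = np.minimum(gamma / (m**2 + 1e-12), 1e12)

pruned = alpha > 1e6                # large precision pins the weight at 0
print("pruned weight indices:", np.where(pruned)[0])
print("kept weight indices:  ", np.where(~pruned)[0])
```

Run as-is, the three informative weights survive while the precisions of the remaining seven diverge, mirroring the soft pruning behavior the abstract attributes to ARD.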