
AccSMBO: Using Hyperparameters Gradient and Meta-Learning to Accelerate SMBO (Cited by: 1)
Abstract: Modern machine learning models involve many hyperparameters, and tuning them by hand is an exhausting job, so hyperparameter optimization algorithms play an important role in applied machine learning. Among them, sequential model-based optimization (SMBO) and parallel SMBO are the state-of-the-art methods. However, (parallel) SMBO takes into account neither the high-probability range of the best hyperparameters nor hyperparameter gradients, although both can clearly accelerate traditional hyperparameter optimization. This paper accelerates traditional SMBO with an algorithm named AccSMBO, built on three techniques. First, a gradient-based multi-kernel Gaussian process regression with good noise resistance avoids the influence of hyperparameter-gradient noise on the fitted Gaussian process and speeds up building a good model of hyperparameter performance. Second, a meta-acquisition function reads a meta-learning dataset to summarize the high-probability range of the best hyperparameters, accelerating the search for the optimum. Third, in the natural parallel version of AccSMBO, a parallel resource allocation scheme assigns more computing resources to hyperparameters inside the high-probability range, so that this range is explored faster. Together, the three techniques fully exploit hyperparameter gradients and the best-hyperparameter high-probability range to accelerate SMBO. In experiments with an L2-regularized logistic loss function on datasets of different scales (the small-scale Pc4, the medium-scale Rcv1, and the large-scale Real-sim), AccSMBO finds the best-performing hyperparameters with the fewest resources, compared with the SMBO-based SMAC (sequential model-based algorithm configuration) algorithm, the gradient-based HOAG (hyperparameter optimization with approximate gradient) algorithm, and common random search.
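To make concrete the SMBO loop that AccSMBO accelerates, the following is a minimal Python sketch of plain sequential model-based optimization for the paper's experimental setting: tuning the L2-regularization strength of logistic regression. It uses a sum of two kernels as a simple stand-in for a multi-kernel Gaussian-process surrogate and a standard expected-improvement acquisition; the paper's actual gradient-based multi-kernel GP, meta-acquisition function, and parallel scheduler are not reproduced here, and all names below (validation_loss, expected_improvement, the search range) are illustrative assumptions rather than the authors' code.

import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_classification
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the paper's datasets (Pc4 / Rcv1 / Real-sim).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

def validation_loss(log_lam):
    # Black-box objective: cross-validated error of L2-regularized
    # logistic regression at regularization strength lambda = exp(log_lam)
    # (sklearn's C is the inverse of lambda).
    clf = LogisticRegression(C=float(np.exp(-log_lam)), max_iter=1000)
    return 1.0 - cross_val_score(clf, X, y, cv=3).mean()

def expected_improvement(cand, gp, best):
    # Standard EI acquisition for minimization.
    mu, sigma = gp.predict(cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
H = rng.uniform(-6.0, 6.0, size=(3, 1))           # initial design in log-lambda
L = np.array([validation_loss(h[0]) for h in H])

# A simple "multi-kernel" surrogate: summing an RBF and a Matern kernel
# hedges against a single mis-specified kernel. AccSMBO's surrogate also
# fits hyperparameter gradients, which a plain sklearn GP cannot do.
gp = GaussianProcessRegressor(kernel=RBF() + Matern(nu=2.5), normalize_y=True)

for _ in range(10):                               # the SMBO loop
    gp.fit(H, L)                                  # 1. fit surrogate to history
    cand = rng.uniform(-6.0, 6.0, size=(200, 1))  # 2. score random candidates
    nxt = cand[np.argmax(expected_improvement(cand, gp, L.min()))]
    H = np.vstack([H, nxt])                       # 3. evaluate the winner
    L = np.append(L, validation_loss(nxt[0]))

print("best log-lambda:", H[np.argmin(L), 0], "cv error:", L.min())

Against this baseline, AccSMBO as described in the abstract would change step 1 to fit a gradient-based multi-kernel GP on both losses and hyperparameter gradients, change step 2 to a meta-acquisition function biased toward the meta-learned high-probability range of the best hyperparameters, and, in the parallel version, evaluate several candidates at once with more workers assigned to that range.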
Authors: Cheng Daning; Zhang Hanping; Xia Fen; Li Shigang; Yuan Liang; Zhang Yunquan (University of Chinese Academy of Sciences, Beijing 100190; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Wisdom Uranium Technology Co. Ltd, Beijing 100190; Swiss Federal Institute of Technology Zurich, Zurich, Switzerland 8914; University at Buffalo, The State University of New York, New York 14260)
Source: Journal of Computer Research and Development (计算机研究与发展; indexed in EI, CSCD, Peking University core journals), 2020, No. 12, pp. 2596-2609 (14 pages).
Funding: National Natural Science Foundation of China (61432018, 61521092, 61272136, 61521092, 61502450); National Key Research and Development Program of China (2016YFB0200803); Beijing Natural Science Foundation (L1802053).
Keywords: AutoML; SMBO; black-box optimization; hyperparameter gradient; meta-learning; parallel resource allocation

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部