A Method for Hyper-parameter Tuning of Neural Network Based on m×2 Regularized Cross-validation
Abstract Hyper-parameter tuning is a key problem in neural network modeling. To address the shortcomings of traditional tuning methods, this paper proposes a hyper-parameter tuning method based on m×2 regularized cross-validation. The aim is a robust, computationally inexpensive tuning method suited to complex models and large datasets. The method tunes on a small subset drawn from the full dataset, which avoids the very time-consuming tuning runs that large datasets otherwise require. On top of m×2 cross-validation, it imposes regularization conditions that balance the distributional differences between the training and validation sets, reducing the performance fluctuation caused by inconsistent distributions. It uses the signal-to-noise ratio as the tuning objective, so that both the mean and the variance of the model's performance metric are taken into account. It also uses orthogonal design to select hyper-parameter combinations with low correlation, which improves tuning efficiency. Experiments on a named entity recognition task with the CoNLL 2003 dataset show that the proposed method selects hyper-parameter combinations whose performance does not differ significantly from those found by grid search, while reducing tuning time by about 66%.
Authors CAO Xue-fei (曹学飞); YANG Fan (杨帆); LI Ji-hong (李济洪); WANG Rui-bo (王瑞波); NIU Qian (牛倩) (School of Automation and Software Engineering, Shanxi University, Taiyuan 030006, China; School of Modern Educational Technology, Shanxi University, Taiyuan 030006, China)
Source Computer Technology and Development (《计算机技术与发展》), 2024, No. 4, pp. 168-173 (6 pages)
Funding National Natural Science Foundation of China (61806115, 62076156).
Keywords m×2 cross-validation; regularization; neural network; hyper-parameter tuning; signal-to-noise ratio
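
The tuning loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions. The abstract does not state the regularization conditions, so label-stratified halving is swapped in as a stand-in for balancing the training and validation distributions. The signal-to-noise ratio is assumed to take the form 10·log10(mean²/variance), which may differ from the paper's definition. All function names (`stratified_halves`, `m_by_2_scores`, `signal_to_noise`, `pick_best`) are hypothetical.

```python
import math
import random
import statistics
from collections import defaultdict


def stratified_halves(labels, rng):
    """Split example indices into two halves with similar label proportions.

    Assumption: this stands in for the paper's regularization condition,
    which the abstract does not spell out.
    """
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    first, second = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        half = len(indices) // 2
        first.extend(indices[:half])
        second.extend(indices[half:])
    return first, second


def m_by_2_scores(labels, evaluate, m=3, seed=0):
    """Run m replications of 2-fold cross-validation; return the 2*m scores.

    `evaluate(train_idx=..., valid_idx=...)` is a caller-supplied function
    that trains on one half and returns a validation score for the other.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(m):
        a, b = stratified_halves(labels, rng)
        scores.append(evaluate(train_idx=a, valid_idx=b))
        scores.append(evaluate(train_idx=b, valid_idx=a))
    return scores


def signal_to_noise(scores):
    """Assumed SNR form: 10*log10(mean^2 / sample variance), in decibels."""
    mean = statistics.mean(scores)
    var = statistics.variance(scores)
    if mean == 0:
        return -math.inf
    if var == 0:
        return math.inf
    return 10 * math.log10(mean ** 2 / var)


def pick_best(candidates, labels, make_evaluator, m=3):
    """Return the candidate configuration whose m x 2 scores have the highest SNR."""
    def snr_of(cfg):
        return signal_to_noise(m_by_2_scores(labels, make_evaluator(cfg), m=m))
    return max(candidates, key=snr_of)
```

In the setting the abstract describes, `labels` would come from the small subset drawn from the full dataset, and `candidates` would be the hyper-parameter combinations chosen by the orthogonal design rather than a full grid. Neither step is shown here.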