A Tuning Method for Bi-directional Long Short-Term Memory Neural Network Model Based on Robust Design
Abstract The bi-directional long short-term memory (BiLSTM) neural network model is widely used in natural language processing, but tuning it is difficult in practice. Taking semantic role recognition as an example task, this paper treats four candidate features (word, part of speech, target word, and position) and two hyperparameters (the number of network layers and whether a CRF classifier is added on top) as factors in a robust design, sets levels for each factor, and runs experiments to select the optimal configuration of features and hyperparameters. A full factorial experiment is run with 3×2 cross validation on a small dataset (6,692 example sentences annotated with semantic roles), and the optimal configuration is selected with the larger-the-better signal-to-noise ratio (SNR) of robust design as the optimization objective. An analysis of variance over the factors then quantifies each factor's influence on model performance, which gives the model a degree of interpretability. To verify the quality of the selected configuration, a traditional greedy tuning strategy is applied to a large dataset (about 40,000 example sentences) using the 8:1:1 split that is standard in natural language processing, and the configuration it selects is compared with ours on the test set. The results show that the proposed tuning method outperforms the traditional one.
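The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the factor levels in `FACTORS` are assumptions (the abstract names the six factors but not their levels), the function names are hypothetical, and the SNR uses the standard Taguchi larger-the-better form, -10·log10 of the mean of 1/y², applied to the scores a configuration obtains across cross-validation runs (six scores per configuration under 3×2 cross validation).

```python
import itertools
import math

# Hypothetical two-level settings for the six factors named in the abstract.
# The actual levels used in the paper are not given here.
FACTORS = {
    "word": (False, True),
    "pos": (False, True),
    "target_word": (False, True),
    "position": (False, True),
    "num_layers": (1, 2),
    "use_crf": (False, True),
}


def full_factorial(factors):
    """Yield every combination of factor levels (a full factorial design)."""
    names = list(factors)
    for levels in itertools.product(*(factors[name] for name in names)):
        yield dict(zip(names, levels))


def snr_larger_the_better(scores):
    """Taguchi larger-the-better SNR in dB: -10 * log10(mean(1 / y^2))."""
    if not scores or any(y <= 0 for y in scores):
        raise ValueError("scores must be a non-empty list of positive values")
    mean_inv_sq = sum(1.0 / (y * y) for y in scores) / len(scores)
    return -10.0 * math.log10(mean_inv_sq)


def select_best(results):
    """Return the configuration key whose scores give the highest SNR.

    `results` maps a configuration key to its list of scores,
    e.g. six F1 values from 3x2 cross validation.
    """
    return max(results, key=lambda key: snr_larger_the_better(results[key]))
```

Because the larger-the-better SNR penalizes low scores more than it rewards high ones, a configuration with stable scores is preferred over one with the same mean but larger spread, which is the sense in which the selection is "robust".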
Authors CAO Xuefei; LI Jihong; WANG Ruibo; NIU Qian; WANG Yu (School of Software, Shanxi University, Taiyuan, 030006, China; School of Modern Education Technology, Shanxi University, Taiyuan, 030006, China)
Source Chinese Journal of Applied Probability and Statistics (CSCD; Peking University Core Journal), 2022, No. 3, pp. 317-332 (16 pages)
Funding Supported by the National Natural Science Foundation of China (Grant Nos. 62076156, 61806115).
Keywords robust design; semantic role recognition; long short-term memory neural network; 3×2 cross validation