
A Cross-Layer Connected Gate Structure Design for Recurrent Neural Networks (Cited: 3)

Design of recurrent neural network based on CIGU
Abstract: To address the slow convergence and poor training performance of recurrent neural network (RNN) architectures in deep networks, the gate structures of long short-term memory (LSTM) and Highway networks are analyzed, and a cross-layer integration gated unit (CIGU) that connects information between layers is proposed. Building on the way an RNN unrolls in time, an inter-layer gate is designed so that the CIGU model retains long- and short-term memory when gradients descend backward through the layers in space, analogous to the way LSTM propagates in time, thereby strengthening the deep spatial learning ability of the RNN. The proposed structure is applied to LSTM, and the different gate structures are trained and tested on the PTB language dataset. The results show that as the number of model layers grows, the CIGU's training convergence speed and test results improve significantly over both traditional LSTM and LSTM based on the Highway network structure.
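This record does not include the paper's equations, so the following is only a rough sketch of the idea the abstract describes: a Highway-style gate that lets each stacked LSTM layer blend its own output with the output of the layer below, giving gradients a short path in depth. It is written in PyTorch; the class names HighwayGate and CrossLayerLSTM and all parameter choices are hypothetical illustrations, not taken from the paper.

import torch
import torch.nn as nn

class HighwayGate(nn.Module):
    # Blends a layer's candidate output with the activation arriving
    # from the layer below, Highway-network style: t controls how much
    # new information passes, while (1 - t) carries the lower layer
    # through unchanged. (Assumed form, not the paper's exact CIGU.)
    def __init__(self, hidden_size):
        super().__init__()
        self.transform = nn.Linear(hidden_size, hidden_size)

    def forward(self, candidate, below):
        t = torch.sigmoid(self.transform(candidate))
        return t * candidate + (1 - t) * below

class CrossLayerLSTM(nn.Module):
    # A stack of LSTM layers with a cross-layer gate between adjacent
    # layers, in the spirit of the CIGU described in the abstract.
    def __init__(self, input_size, hidden_size, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        self.gates = nn.ModuleList()
        for i in range(num_layers):
            in_size = input_size if i == 0 else hidden_size
            self.layers.append(nn.LSTM(in_size, hidden_size, batch_first=True))
            self.gates.append(HighwayGate(hidden_size))

    def forward(self, x):
        h = x
        for i, (lstm, gate) in enumerate(zip(self.layers, self.gates)):
            out, _ = lstm(h)
            # The first layer has no same-width input to bypass to.
            h = gate(out, h) if i > 0 else out
        return h

In this sketch the gated bypass in depth plays the role the LSTM cell state plays in time: when t saturates near zero, lower-layer activations and their gradients flow through upper layers almost unchanged, which matches the abstract's claim of long- and short-term memory in the spatial direction.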
Authors: YU Fang-heng (余昉恒), SHEN Hai-bin (沈海斌) (Institute of Very Large Scale Integration Circuit Design, Zhejiang University, Hangzhou 310027, China)
Source: Transducer and Microsystem Technologies (《传感器与微系统》), CSCD, 2018, No. 8, pp. 91-93 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 61371032)
Keywords: recurrent neural network (RNN); long short-term memory (LSTM); Highway network; gated unit

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部