
Research on Speech Recognition Based on Improved LSTM Deep Neural Network

Cited by: 25
Abstract: Current neural network language models based on the LSTM architecture introduce an LSTM structural unit in the hidden layer. This unit contains a memory cell that retains information over long spans, giving the model a good memory for historical information. In the standard LSTM, however, the state of the current input cannot influence the final output of the output gate, so relatively little historical information is exploited. To address this problem, the authors propose a modeling method based on an improved LSTM (long short-term memory) network. The model adds a connection from the current input gate to the output gate, and merges the forget gate and input gate into a single update gate. Through this update gate, past and present memory are combined and previously accumulated information can be selectively forgotten, so the improved LSTM can learn long-range historical information. This overcomes the shortcoming of the standard LSTM and yields stronger robustness. A neural network language model based on the improved LSTM structure was tested on the TIMIT dataset; the results show that the recognition error rate of the improved LSTM is 5% lower than that of the standard LSTM.
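The paper's exact update equations are not reproduced in this abstract, but the two described modifications can be sketched in code. The following is a minimal illustrative sketch, assuming (a) the merged forget/input gate is realized as a coupled update gate z (forget = z, input = 1 - z), and (b) the input-to-output-gate connection is realized by letting the freshly updated cell state feed the output gate. All class names, weight shapes, and the initialization scheme are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ImprovedLSTMCell:
    """Illustrative sketch of the modified LSTM cell described in the abstract:
    - forget and input gates merged into one update gate z (coupled gates),
    - an extra connection so the current cell state (which depends on the
      current input) influences the output gate's final output.
    Shapes and initialization are assumptions, not the paper's specification.
    """

    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        k = input_size + hidden_size
        s = 1.0 / np.sqrt(k)
        self.W_z = rng.uniform(-s, s, (hidden_size, k))   # single update gate
        self.W_c = rng.uniform(-s, s, (hidden_size, k))   # candidate memory
        self.W_o = rng.uniform(-s, s, (hidden_size, k))   # output gate
        self.V_o = rng.uniform(-s, s, (hidden_size, hidden_size))  # cell -> output gate
        self.b_z = np.zeros(hidden_size)
        self.b_c = np.zeros(hidden_size)
        self.b_o = np.zeros(hidden_size)

    def step(self, x, h_prev, c_prev):
        xh = np.concatenate([x, h_prev])
        z = sigmoid(self.W_z @ xh + self.b_z)         # one gate replaces forget + input
        c_tilde = np.tanh(self.W_c @ xh + self.b_c)   # candidate new memory
        c = z * c_prev + (1.0 - z) * c_tilde          # keep old vs. write new memory
        # Output gate also sees the current cell state, so the current
        # input can influence the final output:
        o = sigmoid(self.W_o @ xh + self.V_o @ c + self.b_o)
        h = o * np.tanh(c)
        return h, c
```

Because z couples forgetting and writing, the cell state stays a convex combination of old and new memory at every step, which is what lets the model retain long-range history without separately tuned forget and input gates.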
Authors: ZHAO Shufang; DONG Xiaoyu (Institute of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)
Source: Journal of Zhengzhou University (Engineering Science), 2018, No. 5, pp. 63-67 (5 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61202163); Shanxi Province "12th Five-Year Plan" Major Science and Technology Project (20121101001); Shanxi Province Teaching and Research Project (J2017078)
Keywords: long short-term memory (LSTM); deep neural network; speech recognition

