N-Best Rescoring Algorithm Based on Long Short-Term Memory Recurrent Neural Network Language Model
Abstract: First, the neurons in the hidden layer of a recurrent neural network are replaced with long short-term memory (LSTM) units to avoid the vanishing-gradient problem. Second, the LSTM RNNLM is applied in a second decoding pass. During speech decoding, a recurrent neural network language model forces the lattice to be expanded too many times, blowing up the search space and degrading search speed, so the lattice is ill-suited to rescoring with an advanced language model. By contrast, the linear structure of an N-best list is better suited to models that exploit long-distance information, so the N-best list is rescored with the LSTM RNNLM and the recognition hypotheses are reranked. Finally, perplexity and continuous speech recognition experiments are conducted on the Penn Treebank and WSJ corpora, respectively. The results show that the proposed method effectively reduces language model perplexity and improves the performance of the continuous speech recognition system.
Affiliation: Information Engineering University
Source: Journal of Information Engineering University, 2017, No. 4, pp. 419-425 (7 pages)
Funding: National Natural Science Foundation of China (grants 61175017 and 61403415)
Keywords: LSTM; recurrent neural network; language model; N-best rescoring
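
As a rough illustration of the second-pass scheme described in the abstract, the sketch below defines a small word-level LSTM language model in PyTorch and reranks an N-best list by interpolating its sentence log-probability with the first-pass decoder score. This is a minimal sketch under assumed conventions, not the authors' implementation: the names LSTMLanguageModel and rescore_nbest, the layer sizes, and the interpolation weight lm_weight are hypothetical choices, and in practice the model would be trained on the LM corpus and the N-best lists would come from the WSJ first-pass decoder.

```python
from typing import Callable, List, Tuple

import torch
import torch.nn as nn


class LSTMLanguageModel(nn.Module):
    """Word-level LM whose hidden layer is an LSTM, so gradients can flow
    through the cell state without vanishing (layer sizes are hypothetical)."""

    def __init__(self, vocab_size: int, embed_dim: int = 200, hidden_dim: int = 200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor, state=None):
        out, state = self.lstm(self.embed(tokens), state)
        return self.proj(out), state

    def sentence_logprob(self, token_ids: List[int]) -> float:
        """Sum of log P(w_t | w_<t); token_ids should include <s> ... </s>."""
        tokens = torch.tensor([token_ids])                 # shape (1, T)
        with torch.no_grad():
            logits, _ = self.forward(tokens[:, :-1])       # predict each next word
            logp = torch.log_softmax(logits, dim=-1)
            targets = tokens[:, 1:].unsqueeze(-1)          # gold next words
            return logp.gather(-1, targets).sum().item()


def rescore_nbest(
    nbest: List[Tuple[float, List[int]]],       # (first-pass log-score, token ids)
    lm_logprob: Callable[[List[int]], float],   # e.g. model.sentence_logprob
    lm_weight: float = 0.8,                     # hypothetical interpolation weight
    word_penalty: float = 0.0,                  # optional insertion penalty
) -> List[Tuple[float, List[int]]]:
    """Second pass: add the LSTM-LM score to each hypothesis and rerank;
    the top entry of the returned list is the new 1-best result."""
    rescored = [
        (score + lm_weight * lm_logprob(ids) + word_penalty * len(ids), ids)
        for score, ids in nbest
    ]
    return sorted(rescored, key=lambda s: s[0], reverse=True)
```

Because each N-best hypothesis is a complete word sequence, it can be scored independently in a single left-to-right pass over the LSTM; this is the property that makes the linear N-best list cheaper to rescore than a lattice, which a long-span model would have to expand repeatedly.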