Abstract
First, the neurons in the hidden layer of the recurrent neural network are replaced with long short-term memory (LSTM) units to avoid the vanishing-gradient problem. Second, the LSTM RNNLM is applied in second-pass decoding. During speech decoding, a recurrent neural network language model expands the lattice too many times, blowing up the search space and slowing the search, so the lattice is unsuitable for rescoring with advanced language models. By contrast, the linear structure of the N-best list is better suited to models that capture long-distance information, so the N-best list is rescored with the LSTM RNNLM and the recognition results are reranked. Finally, perplexity and continuous speech recognition experiments are carried out on the Penn Treebank and WSJ corpora. The experiments show that the method effectively reduces language-model perplexity and improves the performance of the continuous speech recognition system.
Firstly, the neurons in the hidden layer of the recurrent neural network are replaced by long short-term memory units to avoid gradient vanishing. Secondly, we use the LSTM RNNLM in a second-pass decoding strategy. In the decoding stage, the lattice is not suitable for rescoring with a recurrent neural network language model, which expands the lattice too many times, reducing search efficiency with a blowing-up search space. On the contrary, the N-best algorithm, with its linear structure, is more fitting for models using long-distance information. Therefore, the paper adopts the N-best algorithm for LSTM RNNLM rescoring. The experimental results show that the proposed method can not only effectively reduce language model perplexity, but also improve the performance of continuous speech recognition.
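The N-best rescoring described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `lstm_lm_logprob` is a hypothetical placeholder for the trained LSTM RNNLM, and the score combination (acoustic log-probability plus a weighted LM log-probability) is one common formulation.

```python
def lstm_lm_logprob(words):
    # Placeholder: a real system would run the sentence through the
    # trained LSTM RNNLM and sum the per-word log-probabilities.
    return -1.0 * len(words)

def rescore_nbest(nbest, lm_weight=0.5):
    """Rerank N-best hypotheses by acoustic score plus weighted LM score.

    nbest: list of (words, acoustic_logprob) pairs from first-pass decoding.
    Returns hypotheses sorted best-first by the combined score.
    """
    scored = []
    for words, am_score in nbest:
        total = am_score + lm_weight * lstm_lm_logprob(words)
        scored.append((total, words))
    scored.sort(reverse=True)          # higher combined log-score = better
    return [(words, total) for total, words in scored]

# Toy first-pass output: two hypotheses with acoustic log-probabilities.
nbest = [
    (["the", "cat", "sat"], -10.0),
    (["the", "cat", "sad"], -9.5),
]
reranked = rescore_nbest(nbest)
```

Because the LM operates on a flat word sequence, rescoring an N-best list avoids the lattice-expansion blow-up the abstract describes.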
Source
《信息工程大学学报》
2017, No. 4, pp. 419-425 (7 pages)
Journal of Information Engineering University
Funding
Supported by the National Natural Science Foundation of China (61175017, 61403415)