基于双向改进门控循环单元维吾尔语语音识别

Uyghur Speech Recognition Based on Bidirectional Improved Gated Recurrent Unit
Abstract: To improve speech recognition accuracy and reduce the complexity of the trained model, a speech recognition method based on a bidirectional improved gated recurrent unit (GRU) acoustic model is proposed. The reset gate is removed from the model, and the ReLU activation function used in the state update is combined with batch normalization (BN) on the feed-forward connections; the improved model has lower computational complexity and converges faster. The bidirectional structure helps the model capture both past and future temporal context and further improves recognition accuracy. Experiments on the THUYG-20 Uyghur corpus show that, compared with a baseline deep neural network, the bidirectional improved GRU reduces the absolute word error rate by 2.34%; compared with a standard bidirectional long short-term memory (LSTM) network, it reduces average per-epoch training time by 13.4%.
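The cell variant described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the parameter names, initialization, and the `bidirectional_pass` wrapper are assumptions, and the batch normalization on feed-forward connections mentioned in the abstract is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simplified_gru_step(x_t, h_prev, Wz, Uz, bz, Wh, Uh, bh):
    # Update gate; the reset gate of a standard GRU is removed entirely.
    z = sigmoid(Wz @ x_t + Uz @ h_prev + bz)
    # Candidate state: ReLU replaces the usual tanh activation.
    h_cand = np.maximum(0.0, Wh @ x_t + Uh @ h_prev + bh)
    # Interpolate between the previous and candidate hidden states.
    return (1.0 - z) * h_prev + z * h_cand

def bidirectional_pass(xs, params_fwd, params_bwd, hidden):
    # Run the simplified cell forward and backward over the sequence,
    # then concatenate the two hidden states at each time step.
    h_f, fwd = np.zeros(hidden), []
    for x in xs:
        h_f = simplified_gru_step(x, h_f, *params_fwd)
        fwd.append(h_f)
    h_b, bwd = np.zeros(hidden), []
    for x in reversed(xs):
        h_b = simplified_gru_step(x, h_b, *params_bwd)
        bwd.append(h_b)
    bwd.reverse()
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

def init_params(inp, hidden):
    # Small random weights (Wz, Uz, bz, Wh, Uh, bh); illustrative only.
    return (rng.normal(0, 0.1, (hidden, inp)), rng.normal(0, 0.1, (hidden, hidden)),
            np.zeros(hidden),
            rng.normal(0, 0.1, (hidden, inp)), rng.normal(0, 0.1, (hidden, hidden)),
            np.zeros(hidden))

inp, hidden, T = 4, 8, 5
xs = [rng.normal(size=inp) for _ in range(T)]
outs = bidirectional_pass(xs, init_params(inp, hidden), init_params(inp, hidden), hidden)
print(len(outs), outs[0].shape)  # one 2*hidden-wide vector per time step
```

Dropping the reset gate removes one matrix product per step, which is consistent with the reported reduction in per-epoch training time relative to a bidirectional LSTM.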
Authors: LI Lian-zhen; Mijit ABLIMIT; ZHENG Fang; Askar HAMDULLA (College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China)
Source: Computer Simulation (《计算机仿真》, Peking University Core Journal), 2022, No. 11, pp. 275-279
Funding: National Key R&D Program of China (2017YFC0820602)
Keywords: Uyghur; speech recognition; acoustic model; gated recurrent unit
