
An Experiment of Acoustic Model Adaptation Based on Deep Neural Network (cited 5 times)
Abstract: Acoustic model adaptation algorithms aim to reduce the recognition performance degradation caused by the mismatch between training and testing data. Among the adaptation techniques based on deep neural networks (DNN), retraining is the most straightforward, but it is prone to over-fitting, especially when adaptation data are sparse. In this paper, two typical acoustic model adaptation methods, namely linear transformation network adaptation and Kullback-Leibler (KL) divergence regularization adaptation, are experimentally explored for domain-specific automatic speech recognition tasks. An elaborate comparison of system performance is made, and results show that the KL divergence regularization technique achieves better performance under different amounts of adaptation data.
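The KL divergence regularization method compared in the abstract is conventionally implemented by interpolating the one-hot adaptation labels with the posteriors of the unadapted speaker-independent (SI) model, which is equivalent to adding a KL regularizer that keeps the adapted network close to the SI network. Below is a minimal NumPy sketch of that interpolated loss; the function and variable names are illustrative and not taken from the paper:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kld_adaptation_loss(logits, labels, si_posteriors, rho):
    """Cross-entropy against a target distribution interpolated between
    the one-hot adaptation labels and the SI model's posteriors.
    rho = 0 recovers plain retraining; rho = 1 pins the adapted model
    to the SI model's outputs. The loss is linear in rho."""
    p = softmax(logits)                          # adapted model posteriors
    one_hot = np.eye(p.shape[-1])[labels]        # hard adaptation targets
    target = (1.0 - rho) * one_hot + rho * si_posteriors
    return -np.mean(np.sum(target * np.log(p + 1e-12), axis=-1))
```

A larger `rho` is the usual safeguard when adaptation data are sparse, which matches the abstract's finding that the regularized method degrades gracefully across adaptation-data amounts.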
Source: Journal of Tianjin University: Science and Technology (indexed by EI, CAS, CSCD, PKU Core), 2015, Issue 9: 765-770 (6 pages)
Funding: National High-Tech Research and Development Program of China (863 Program) (2012AA012503); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); National Natural Science Foundation of China (11461141004, 91120001, 61271426); Key Deployment Project of the Chinese Academy of Sciences (KYGD-EW-103-2)
Keywords: acoustic model adaptation; speech recognition; deep neural network (DNN)

References (17)

  • 1 Seide F, Li G, Yu D. Conversational speech transcription using context-dependent deep neural networks[C]//Proceedings of the 12th Annual Conference of the International Speech Communication Association. Florence, Italy, 2011: 437-440.
  • 2 Dahl G E, Yu D, Deng L, et al. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 30-42.
  • 3 Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups[J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
  • 4 Eide E, Gish H. A parametric approach to vocal tract length normalization[C]//IEEE International Conference on Acoustics, Speech, and Signal Processing. Atlanta, USA, 1996: 346-348.
  • 5 Gauvain J L, Lee C H. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains[J]. IEEE Transactions on Speech and Audio Processing, 1994, 2(2): 291-298.
  • 6 Leggetter C J, Woodland P C. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models[J]. Computer Speech and Language, 1995, 9(2): 171-185.
  • 7 Li Bo, Sim Khe Chai. Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems[C]//Proceedings of the 11th Annual Conference of the International Speech Communication Association. Makuhari, Japan, 2010: 526-529.
  • 8 Gemello R, Mana F, Scanzio S, et al. Linear hidden transformations for adaptation of hybrid ANN/HMM models[J]. Speech Communication, 2007, 49(10): 827-835.
  • 9 Seide F, Li G, Chen X, et al. Feature engineering in context-dependent deep neural networks for conversational speech transcription[C]//2011 IEEE Workshop on Automatic Speech Recognition and Understanding. Hawaii, USA, 2011: 24-29.
  • 10 Yao K, Yu D, Seide F, et al. Adaptation of context-dependent deep neural networks for automatic speech recognition[C]//Spoken Language Technology Workshop. Miami, USA, 2012: 366-369.

Co-cited references (46)

  • 1 Zou Zhengda, Sun Yaming, Zhang Zhisheng. Short-term load forecasting based on recurrent neural networks with ant colony optimization algorithm[J]. Power System Technology, 2005, 29(3): 59-63. (cited 46 times)
  • 2 Nan Jianshe. Research on analysis techniques for fine features of signals[J]. Telecommunication Engineering, 2007, 47(2): 68-71. (cited 10 times)
  • 3 Tang Xianlun, Zhuang Ling, Li Yinguo, Cao Changxiu. Hybrid particle swarm optimization algorithm for optimizing the structure and parameters of feed-forward neural networks[J]. Application Research of Computers, 2007, 24(12): 91-93. (cited 15 times)
  • 4 Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modeling in speech recognition[J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
  • 5 Mohamed A, Dahl G, Hinton G. Acoustic modeling using deep belief networks[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 14-22.
  • 6 Deng L, Yu D, Platt J. Scalable stacking and learning for building deep architectures[C]//ICASSP. Kyoto, Japan: IEEE Press, 2012: 2133-2136.
  • 7 Liu C, Zhang Z, Wang D. Pruning deep neural networks by optimal brain damage[C]//Proc Interspeech. Singapore, 2014.
  • 8 LeCun Y, Denker J, Solla S, et al. Optimal brain damage[J]. Advances in Neural Information Processing Systems (NIPS), 1989, 2: 598-605.
  • 9 Li J, Zhao R, Huang J, et al. Learning small-size DNN with output-distribution-based criteria[C]//Proc Interspeech. Singapore, 2014.
  • 10 Xue J, Li J, Gong Y. Restructuring of deep neural network acoustic models with singular value decomposition[C]//Proc Interspeech. Lyon, France, 2013.

Citing articles (5)

Secondary citing articles (39)
