Application of convolutional neural network algorithm in speech recognition (cited by: 15)
Abstract: With the exponential growth of information on the Internet, the features of massive speech data show large speaker-to-speaker variability and strong noise interference, and commonly used feature extraction and feature transformation methods can no longer meet the needs of current model training and recognition. Building on the close integration of speech recognition and deep learning theory in recent years, the authors find that the structure of a convolutional neural network (CNN) is well suited to feature extraction from speech signals. This paper proposes a CNN-based feature extraction method and combines it with a relatively complex GMM-HMM model to form a new speech recognition system. Experiments show that the CNN structure copes well with speaker variability and noise in speech signals, that the GMM-HMM model is better suited than a softmax classifier to modeling complex speech signals, and that the final recognition rate improves substantially.
Authors: ZHANG Wen-yu; LIU Chang (School of Economics and Management, Xi'an University of Posts & Telecommunications, Xi'an 710061, China)
Source: Information Technology (《信息技术》), 2018, No. 10, pp. 147-152 (6 pages)
Keywords: feature extraction; convolutional neural network; speech recognition
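
The abstract does not give the network architecture, so the following is only a minimal sketch of the general idea of CNN feature extraction on a speech frame: convolve along the frequency axis, apply a ReLU, then max-pool. The filter weights, pooling size, and toy 12-band frames are all assumptions made for illustration, not values from the paper, and the GMM-HMM back end that would consume these features is not shown.

```python
import numpy as np

def conv1d_freq(frame, kernel):
    """Valid 1-D cross-correlation of one spectral frame with one filter along frequency."""
    n = len(frame) - len(kernel) + 1
    return np.array([np.dot(frame[i:i + len(kernel)], kernel) for i in range(n)])

def max_pool(x, size):
    """Non-overlapping max pooling; trailing values that do not fill a window are dropped."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

def cnn_features(frame, kernel, pool_size=3):
    """Convolution -> ReLU -> max pooling for a single frame."""
    activation = np.maximum(conv1d_freq(frame, kernel), 0.0)
    return max_pool(activation, pool_size)

# Hypothetical toy input: 12 frequency bands with one spectral peak.
frame_a = np.zeros(12)
frame_a[4] = 1.0
# Same peak shifted up by one band, standing in for a different speaker.
frame_b = np.zeros(12)
frame_b[5] = 1.0

kernel = np.array([0.5, 1.0, 0.5])  # assumed filter, not a learned one

feat_a = cnn_features(frame_a, kernel)
feat_b = cnn_features(frame_b, kernel)
print(feat_a, feat_b)
```

In this toy case the strongest pooled response falls in the same pooled band for both frames, which is one intuition for why convolution with pooling can tolerate small spectral shifts between speakers. In a real system the filters would be learned and the pooled features from many filters would be passed to the acoustic model.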