Abstract
This paper studies end-to-end speech recognition based on deep neural networks. An end-to-end speech recognition system is built from a deep full-sequence convolutional neural network (DFCNN) acoustic model and a Transformer language model. After the models are trained on the data, the system recognizes multi-character Chinese speech. The hidden Markov model approach and the deep neural network approach to speech recognition are also compared in terms of system construction difficulty, underlying principles, and recognition accuracy. Simulation results show that the proposed method effectively recognizes continuous multi-character Chinese speech, with a recognition accuracy above 90%.
Authors
XUE Yajie (薛雅洁)
HE Hongxia (贺红霞)
YANG Yi (杨祎)
School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710061, China
Source
Modern Electronics Technique (《现代电子技术》)
2023, No. 24, pp. 79-84 (6 pages)
Funding
Xi'an Science and Technology Plan Project (101/203010002).