SPEECH RECOGNITION BASED ON MULTI-SCALE RESIDUAL DEEP CONVOLUTIONAL NEURAL NETWORK
Cited by: 10
Abstract To address the poor recognition performance of convolutional neural networks in continuous speech recognition, this paper proposes a speech recognition algorithm based on a multi-scale residual deep convolutional neural network and combines it with connectionist temporal classification (CTC) to build an end-to-end Chinese speech recognition system. Multi-scale learning, a residual mechanism, and dilated convolution are introduced into the network to remove the dependence of sequence modeling on long short-term memory (LSTM) networks, speed up model training, and improve the robustness of speech recognition to noise. Experiments show that, compared with the bidirectional LSTM (BLSTM), deep convolutional neural network (DCNN), and CNN-LSTM models, the proposed model reduces the word error rate (WER) by about 9%, 5%, and 3%, respectively, and that its recognition rate in noisy environments is also better than that of traditional speech recognition systems.
Authors Liu Hong; Yuan Sannan (School of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai 200090, China)
Source Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2020, No. 11, pp. 275-279 (5 pages)
Keywords Speech recognition; Multi-scale; Convolutional neural network; End-to-end
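
The abstract reports its results as reductions in word error rate (WER). As a rough illustration of how such a metric is commonly computed, the sketch below derives an error rate from the Levenshtein (edit) distance between a reference transcript and a recognizer output. This is not the authors' evaluation code: the function names `edit_distance` and `error_rate` are hypothetical, and treating each Chinese character as one token is an assumption made here for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (hypothetical helper)."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j tokens of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance divided by the reference length (hypothetical helper)."""
    if len(ref) == 0:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Assumption: each Chinese character is one token.
    reference = list("今天天气很好")
    hypothesis = list("今天天汽好")  # one substitution and one deletion
    print(f"error rate: {error_rate(reference, hypothesis):.3f}")
```

For word-level scoring of space-delimited languages, the same functions can be applied to `text.split()` instead of `list(text)`.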