Speech Recognition Algorithm Based on Residual Convolutional Neural Network
Abstract: When extracting speech features, the conventional DFCNN acoustic model relies on deep convolutions that capture only local features, cannot selectively emphasize the most informative acoustic features, and trains slowly with poor convergence. To address these problems, this paper proposes DRCNN, an acoustic model based on a deep residual convolutional neural network. Combined with CTC, DRCNN models the acoustic features directly; an SE-Block channel-weighted residual mechanism and a deeply stacked structure speed up acoustic feature extraction, improve fitting ability, and accelerate training. A Transformer-based language model is built on top of this acoustic model. Compared with the conventional DFCNN-HMM model, the proposed system learns deeper features of the speech signal and makes both the acoustic model and the language model more robust. Experiments on a Chinese speech recognition dataset show that the proposed algorithm improves on DFCNN-HMM by 4.03% in word error rate (WER; the Chinese abstract uses the term 字错误率, character error rate).
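The abstract reports its result as an error rate, so a minimal sketch of how such a metric is commonly computed may help: the edit (Levenshtein) distance between the reference and the recognized token sequence, divided by the reference length. The function names `edit_distance` and `error_rate` and the pinyin example tokens are illustrative assumptions, not taken from the paper, which does not describe its scoring procedure here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by the number of reference tokens."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one reference token ("hen") is missing in the output.
reference = ["jin", "tian", "tian", "qi", "hen", "hao"]
hypothesis = ["jin", "tian", "tian", "qi", "hao"]
print(edit_distance(reference, hypothesis), error_rate(reference, hypothesis))
```

For Chinese, the tokens would typically be individual characters, which is why the same formula is reported either as a character error rate or, loosely, as WER.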
Authors: FENG Chengli (冯成立); CHENG Wen (程雯) (Wuhan Research Institute of Posts & Telecommunications, Wuhan 430000)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2023, No. 2, pp. 440-444 (5 pages)
Keywords: speech recognition; CNN; Transformer; self-attention; residual connection; SE-Block
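The SE-Block and residual connection named in the keywords can be illustrated with a small pure-Python sketch. It follows the standard squeeze-and-excitation formulation (global average pooling, a two-layer bottleneck with ReLU and sigmoid, channel-wise rescaling) added to an identity shortcut. This record does not give DRCNN's layer layout, so the function names, the weight shapes, and the idea of passing the convolution branch output in as an argument are all assumptions made for illustration.

```python
import math


def squeeze(feature_map):
    """Global average pooling: one scalar per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]


def dense(vector, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def excite(pooled, w1, b1, w2, b2):
    """Bottleneck MLP (ReLU, then sigmoid): one gate in (0, 1) per channel."""
    hidden = [max(0.0, h) for h in dense(pooled, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]


def se_residual(x, branch, w1, b1, w2, b2):
    """Return x + gate * branch, with one gate per channel of `branch`.

    `branch` stands in for the output of the block's convolution layers
    (assumed here; not computed in this sketch).
    """
    gates = excite(squeeze(branch), w1, b1, w2, b2)
    return [[[xv + g * bv for xv, bv in zip(x_row, b_row)]
             for x_row, b_row in zip(x_ch, b_ch)]
            for x_ch, b_ch, g in zip(x, branch, gates)]


# Hypothetical 2-channel, 2x2 feature maps; zero weights give gates of 0.5.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
branch = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1, b1 = [[0.0, 0.0]], [0.0]             # 2 channels -> 1 hidden unit
w2, b2 = [[0.0], [0.0]], [0.0, 0.0]      # 1 hidden unit -> 2 channel gates
out = se_residual(x, branch, w1, b1, w2, b2)
print(out)
```

In a trained network the gates differ per channel, so informative channels of the convolution branch are amplified relative to the others before being added back to the shortcut path.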