Implementation and Improvement of a Speech Recognition System Based on Deep Speech

Abstract: Deep Speech is an end-to-end speech recognition system that replaces traditional feature extraction with deep learning: it extracts features directly from spectrograms computed from waveform files and generates the corresponding text. Its recurrent neural network, built from gated recurrent units (GRUs), can learn speech information with temporal dependencies, and connectionist temporal classification (CTC) is used both to map inputs to outputs and to update the network parameters. Combining this approach with a language model corrects spelling errors in the recognized words, which yields better recognition results, and the system is also simpler to use.
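The CTC input-to-output mapping mentioned in the abstract can be illustrated with a minimal sketch of greedy decoding: take the most probable symbol for each frame, merge consecutive repeats, then drop the blank symbol. This is only an illustration of the general CTC collapse rule, not the paper's decoder; the function names, the blank symbol "_", and the toy alphabet are assumptions made for the example, and the language-model correction step is not shown.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol; real systems reserve a label index for it


def ctc_collapse(frame_labels, blank=BLANK):
    """Apply the CTC collapse rule: merge consecutive repeats, then drop blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != blank)


def greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Pick the most probable symbol in each frame, then collapse the result."""
    best = [
        alphabet[max(range(len(probs)), key=probs.__getitem__)]
        for probs in frame_probs
    ]
    return ctc_collapse(best, blank)


if __name__ == "__main__":
    # A blank between the two "l" runs keeps the double letter in "hello".
    print(ctc_collapse("hh_e_ll_ll_oo"))

    # Toy per-frame distributions over a hypothetical three-symbol alphabet.
    alphabet = ["_", "a", "b"]
    frames = [
        [0.1, 0.8, 0.1],
        [0.2, 0.7, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.1, 0.8],
    ]
    print(greedy_decode(frames, alphabet))
```

Greedy decoding is the simplest choice; a beam search combined with a language model would typically be used when spelling correction of the kind the abstract describes is wanted.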

Authors: Li Can (李灿), Sun Hao (孙浩), Li Kai (李开)

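Affiliations: Power and Energy Department, Kunming Changshui International Airport; School of Computer Science and Technology, Huazhong University of Science and Technology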

Source: Computer & Digital Engineering (《计算机与数字工程》), 2017, No. 8, pp. 1620-1624 (5 pages)

Keywords: speech recognition; deep learning; recurrent neural network; CTC; gated recurrent unit; stochastic gradient descent; language model

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
