Abstract
This paper proposes an end-to-end Chinese speech recognition model based on deep learning. The raw speech signal is first converted to a spectrogram. A network structure combining a convolutional neural network and a recurrent neural network is then designed and implemented to transcribe Chinese speech into text. The model uses individual Chinese characters as labels and is trained iteratively with a training algorithm and a sequence loss function. Experiments on an open-source dataset examine how the network structure affects recognition performance and compare the model with traditional speech recognition methods. The results show that the proposed model achieves more accurate recognition and requires less training time.
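The first stage described in the abstract, converting the raw signal to a spectrogram, can be illustrated with a minimal NumPy sketch. The function name `log_spectrogram`, the 16 kHz sample rate, the 25 ms Hann window, and the 10 ms hop are all assumptions chosen for illustration; the abstract does not state the paper's actual preprocessing parameters.

```python
import numpy as np

def log_spectrogram(signal, sample_rate=16000, win_ms=25, hop_ms=10, eps=1e-10):
    """Return a (frames, freq_bins) log-magnitude spectrogram of a 1-D signal.

    All default parameters are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    win = int(sample_rate * win_ms / 1000)  # samples per frame
    hop = int(sample_rate * hop_ms / 1000)  # samples between frame starts
    if len(signal) < win:
        # Zero-pad very short inputs so that at least one frame exists.
        signal = np.pad(signal, (0, win - len(signal)))
    n_frames = 1 + (len(signal) - win) // hop
    # Index matrix: row i selects samples [i*hop, i*hop + win).
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(win)  # taper each frame with a Hann window
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)  # eps avoids log(0)

# Usage: one second of a synthetic 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The resulting 2-D array (time frames by frequency bins) is the kind of image-like input that a convolutional front end followed by a recurrent network could consume.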
Authors
代伟
刘洪
DAI Wei; LIU Hong (College of Artificial Intelligence, Neijiang Normal University, Neijiang 641112, Sichuan; College of Computer Science, Sichuan University, Chengdu 610065, Sichuan)
Source
《四川师范大学学报(自然科学版)》
CAS
2022, No. 1, pp. 131-135 (5 pages)
Journal of Sichuan Normal University (Natural Science)
Funding
National Natural Science Foundation of China (71573184).
Keywords
voice recognition
spectrogram
convolutional neural network
recurrent neural network