Abstract
Recordings of 1605 commonly used Chinese characters were made with recording equipment, yielding 920 isolated-character pronunciations and a speaker-independent speech sample library of 3680 samples. Speech spectrograms were used as the features for single-character Chinese speech recognition, and a 6-layer convolutional neural network was constructed and applied to recognition on the sample library. The network was trained on the speech samples, and then used to recognize them, with a deep learning method. Experimental results show that the convolutional neural network with the 20-40-3500 structure performed best on the sample library, with a recognition rate of 97.87% on the test samples and 99.32% on all samples.
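The abstract names the speech spectrogram as the input feature but gives no extraction parameters. The following is a minimal sketch of one common way to compute a log-magnitude spectrogram, not the authors' procedure. The function name `log_spectrogram`, the frame length of 256 samples, the hop of 128 samples, the Hann window, and the synthetic 1 kHz tone standing in for a recorded syllable are all illustrative assumptions.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Return a log-magnitude spectrogram, shape (n_frames, frame_len // 2 + 1).

    frame_len, hop, and the Hann window are assumed values for illustration;
    the abstract does not state the settings used in the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT per frame; eps avoids log(0) in silent bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitude + eps)


# Synthetic stand-in for a recorded syllable: one second of a 1 kHz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=0)))
peak_hz = peak_bin * sample_rate / 256  # bin index -> frequency in Hz
```

The resulting two-dimensional array (time frames by frequency bins) is the kind of image-like input a convolutional neural network can take; resizing or normalizing it for a particular network is left out because the abstract does not describe those steps.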
Authors
白璐
王连明
BAI Lu; WANG Lian-ming (Institute of Computational Intelligence, Northeast Normal University, Changchun 130024, China)
Source
《东北师大学报(自然科学版)》
CAS
Peking University Core Journal List (北大核心)
2020, No. 2, pp. 52-57 (6 pages)
Journal of Northeast Normal University (Natural Science Edition)
Funding
Supported by the National Natural Science Foundation of China (21227008).
Keywords
convolutional neural network
speech recognition
spectrogram
deep learning