Abstract
As the volume of information on the Internet grows exponentially, the features of massive speech data exhibit large inter-speaker variability and noise interference, and commonly used feature extraction and feature transformation methods can hardly meet the needs of current model training and recognition. In recent years speech recognition has become closely integrated with deep learning theory, and research has found that the structure of convolutional neural networks is well suited to extracting features from speech signals. This paper proposes a feature extraction method based on a convolutional neural network and combines it with the relatively complex GMM-HMM model to form a new speech recognition system. Experiments show that the convolutional neural network structure copes well with inter-speaker variability and noise in speech signals, that the GMM-HMM model is better suited than a softmax classifier to modeling complex speech signals, and that the final recognition rate improves substantially.
Authors
张文宇
刘畅
ZHANG Wen-yu; LIU Chang (School of Economics and Management, Xi'an University of Posts & Telecommunications, Xi'an 710061, China)
Source
《信息技术》
2018, No. 10, pp. 147-152 (6 pages)
Information Technology
Keywords
feature extraction
convolutional neural network
speech recognition