Abstract
To address the poor recognition performance of traditional convolutional neural networks (CNNs) on continuous speech data, an improved CNN algorithm is proposed. The method introduces the Fisher criterion and an L2 regularization constraint. During the back-propagation stage in which parameters are adjusted, it minimizes the parameter error while also ensuring that, after classification, samples are widely separated between classes and tightly clustered within each class; at the same time, it keeps the network weights at an appropriate order of magnitude, which effectively alleviates overfitting. To further improve recognition accuracy, the CNN is optimized with a new log activation function that better matches the activation characteristics of biological neurons. Experiments on the TIMIT and THCHS30 speech corpora show that, compared with the traditional CNN algorithm, the proposed algorithm improves the speech recognition rate and generalizes better.
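The abstract does not give the exact objective, so the following is only a minimal sketch of how a classification error, a Fisher-style scatter term, and an L2 weight penalty might be combined into one training loss. The function name `fisher_l2_loss`, the coefficients `alpha`, `beta`, and `lam`, and the specific scatter definitions are all assumptions made for illustration and are not taken from the paper. The log activation function is not sketched because its formula is not stated here.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fisher_l2_loss(features, logits, labels, weights,
                   alpha=0.01, beta=0.01, lam=1e-4):
    """Hypothetical combined loss (illustrative, not the paper's formula).

    features: (n, d) array of learned representations
    logits:   (n, k) array of class scores
    labels:   (n,) integer class labels
    weights:  list of weight arrays to be L2-penalized
    """
    n = features.shape[0]

    # Classification error: mean cross-entropy.
    probs = softmax(logits)
    ce = -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

    # Within-class scatter: mean squared distance of samples to their class mean.
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    within = sum(((features[labels == c] - means[i]) ** 2).sum()
                 for i, c in enumerate(classes)) / n

    # Between-class scatter: squared distances between class means,
    # each unordered pair counted once.
    diffs = means[:, None, :] - means[None, :, :]
    between = (diffs ** 2).sum() / 2.0

    # L2 penalty keeps weight magnitudes in check.
    l2 = sum((w ** 2).sum() for w in weights)

    # Small within-class scatter and large between-class scatter are rewarded.
    total = ce + alpha * within - beta * between + lam * l2
    return total, {"ce": ce, "within": within, "between": between, "l2": l2}
```

In a real training loop the gradient of this total would be back-propagated through the network; the relative sizes of `alpha`, `beta`, and `lam` would need tuning, since an overly large `beta` can make the loss unbounded below.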
Authors
杨洋
汪毓铎
YANG Yang; WANG Yuduo (School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China)
Source
《应用声学》
CSCD
Peking University Core Journal
2018, No. 6, pp. 940-946 (7 pages)
Journal of Applied Acoustics