
Speech recognition based on improved convolutional neural network algorithm
Cited by: 26
Abstract: To address the poor performance of traditional convolutional neural networks (CNNs) on continuous speech data, an improved CNN algorithm is proposed. The method introduces the Fisher criterion and an L2 regularization constraint. During the parameter-adjustment phase of back-propagation, it minimizes the parameter error while also ensuring that, after classification, samples are more scattered between classes and more concentrated within each class. At the same time, it keeps the network weights at an appropriate order of magnitude, which effectively alleviates over-fitting. A new log activation function, which is more consistent with the activation characteristics of biological neurons, is used to optimize the CNN and further improve recognition accuracy. Experiments on the TIMIT and THCHS30 speech corpora show that, compared with the traditional CNN algorithm, the proposed algorithm achieves a higher speech recognition rate and generalizes better.
Authors: YANG Yang (杨洋); WANG Yuduo (汪毓铎) (School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China)
Source: Journal of Applied Acoustics (《应用声学》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 6, pp. 940-946 (7 pages)
Keywords: speech recognition; convolutional neural network; Fisher criterion; L2 regularization; log activation function
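
The abstract names three ingredients (a Fisher criterion term, an L2 penalty, and a log activation) but does not give their formulas. The sketch below shows one common way such a loss could be assembled in NumPy; it is not the authors' formulation. The symmetric form `sign(x) * log(1 + |x|)` for the activation, the scatter definitions, and the weights `alpha`, `beta`, and `lam` are all assumptions made for illustration.

```python
import numpy as np

def log_activation(x):
    # Hypothetical symmetric log activation: sign(x) * log(1 + |x|).
    # The paper's exact function is not given in the abstract.
    return np.sign(x) * np.log1p(np.abs(x))

def fisher_l2_loss(features, labels, probs, weights,
                   alpha=0.01, beta=0.01, lam=1e-4):
    # Assumed combined objective:
    #   cross-entropy + alpha * within-class scatter
    #                 - beta  * between-class scatter
    #                 + lam   * sum of squared weights (L2 penalty)
    n = features.shape[0]

    # Cross-entropy of the predicted probability of the true class.
    ce = -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])

    # Within-class scatter: mean squared distance of samples to their class mean.
    within = sum(np.sum((features[labels == c] - m) ** 2)
                 for c, m in zip(classes, means)) / n

    # Between-class scatter: mean squared distance of class means to the global mean.
    global_mean = features.mean(axis=0)
    between = np.mean(np.sum((means - global_mean) ** 2, axis=1))

    # L2 penalty keeps the weights at a moderate magnitude.
    l2 = sum(np.sum(w ** 2) for w in weights)

    return ce + alpha * within - beta * between + lam * l2
```

Under these assumptions, features that cluster tightly around well-separated class means yield a lower loss than scattered features with the same class means, which matches the behavior the abstract describes.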
