
Adaptive bands filter bank optimized by genetic algorithm for robust speech recognition system (cited by: 5)

Abstract: Perceptual auditory filter banks such as the Bark-scale filter bank are widely used for front-end processing in speech recognition systems. However, the design of optimized filter banks that provide higher accuracy in recognition tasks remains an open problem. Because feature extraction relies on spectral analysis, an adaptive bands filter bank (ABFB) is presented. The design adopts flexible bandwidths and center frequencies for the frequency responses of the filters and uses a genetic algorithm (GA) to optimize these design parameters. The optimization is realized by combining the front-end filter bank with the back-end recognition network in the performance evaluation loop. Deploying ABFB together with the zero-crossing peak amplitude (ZCPA) feature as the front end of a radial basis function (RBF) system shows a significant improvement in robustness compared with the Bark-scale filter bank. In ABFB, several sub-bands are still concentrated toward lower frequencies, but their exact locations are determined by recognition performance rather than by perceptual criteria. For ease of optimization, only symmetrical bands are considered here, and these still provide satisfactory results.
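The optimization loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: each candidate filter bank is a list of symmetric bands given as (center frequency, bandwidth) pairs, and a GA with tournament selection, uniform crossover, Gaussian mutation, and elitism searches over them. The frequency range, band count, GA settings, and especially `placeholder_fitness` are assumptions made for the example. In the paper the score comes from the ZCPA front end and RBF recognition network, which are not reproduced here.

```python
import random

F_LO, F_HI = 0.0, 4000.0      # assumed analysis range in Hz (not from the paper)
BW_MIN, BW_MAX = 50.0, 800.0  # assumed bandwidth limits in Hz
N_BANDS = 16                  # assumed number of sub-bands


def random_bank(rng):
    """One candidate filter bank: (center_hz, bandwidth_hz) per symmetric band."""
    return [(rng.uniform(F_LO, F_HI), rng.uniform(BW_MIN, BW_MAX))
            for _ in range(N_BANDS)]


def placeholder_fitness(bank):
    """Stand-in objective: fraction of a 50 Hz grid covered by at least one band.

    Hypothetical. A real evaluation would run the recognizer and return its score.
    """
    grid = [F_LO + 50.0 * i for i in range(int((F_HI - F_LO) / 50.0) + 1)]
    covered = sum(any(abs(f - c) <= bw / 2.0 for c, bw in bank) for f in grid)
    return covered / len(grid)


def mutate(bank, rng, sigma=100.0, rate=0.2):
    """Perturb some bands with Gaussian noise, clamped to the allowed ranges."""
    out = []
    for c, bw in bank:
        if rng.random() < rate:
            c = min(F_HI, max(F_LO, c + rng.gauss(0.0, sigma)))
            bw = min(BW_MAX, max(BW_MIN, bw + rng.gauss(0.0, sigma / 2.0)))
        out.append((c, bw))
    return out


def crossover(a, b, rng):
    """Uniform crossover: each band is inherited from one parent at random."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def tournament(pop, scores, rng, k=3):
    """Return the best of k randomly chosen individuals."""
    idx = max(rng.sample(range(len(pop)), k), key=lambda i: scores[i])
    return pop[idx]


def evolve(fitness, pop_size=30, generations=40, seed=0):
    """Run the GA; return the best bank and the best score per generation."""
    rng = random.Random(seed)
    pop = [random_bank(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best])
        nxt = [pop[best]]  # elitism: the current best survives unchanged
        while len(nxt) < pop_size:
            child = crossover(tournament(pop, scores, rng),
                              tournament(pop, scores, rng), rng)
            nxt.append(mutate(child, rng))
        pop = nxt
    scores = [fitness(ind) for ind in pop]
    best = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best])
    return pop[best], history


if __name__ == "__main__":
    bank, history = evolve(placeholder_fitness)
    print("best score per generation:", history[0], "->", history[-1])
```

Swapping `placeholder_fitness` for a function that trains and scores a recognizer on features extracted with the candidate bank would place the front end and the back end in the same evaluation loop, which is the coupling the abstract describes. Because the best individual is copied unchanged into each new generation, the best score cannot decrease from one generation to the next.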
Source: Journal of Central South University (English Edition), indexed in SCIE, EI, CAS, 2011, No. 5, pp. 1595-1601 (7 pages)
Funding: Project (61072087) supported by the National Natural Science Foundation of China; Project (20093048) supported by Shanxi Provincial Graduate Innovation Fund of China
Keywords: speech recognition system; adaptive filter; optimization process; filter bank; genetic algorithm; frequency band; design parameters; perceptual filter banks; Bark scale; speaker-independent speech recognition systems; zero-crossing peak amplitude

