Language Identification Based on Vocal Tract Spectrum Parameters (cited by: 11)
Abstract: To address the low accuracy of language identification at low signal-to-noise ratio (SNR), a recognition method is proposed that fuses vocal tract impulse response spectral parameters with Teager energy operator cepstral coefficients. Based on how the information carried by different features is distributed in speech, a low-pass filter is first introduced at the front end of feature extraction to remove the high-frequency part of the signal, and resampling is used to reduce the sampling rate. The vocal tract impulse response spectral parameters are then extracted from the signal spectrum and fused with the Teager energy operator cepstral coefficients. Finally, a Gaussian mixture model-universal background model (GMM-UBM) is used to perform language identification. Tests under different SNR conditions show that, compared with the Mel-frequency cepstral coefficient feature alone, the Gammatone frequency cepstral coefficient feature alone, and the log Mel-scale filter bank energy feature, the proposed method achieves a gain of about 15 dB at low SNR and significantly improves identification accuracy.
Authors: SHAO Yu-bin; LIU Jing; LONG Hua; DU Qing-zhi; LI Yi-min (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Source: Journal of Beijing University of Posts and Telecommunications (indexed in EI, CAS, CSCD, Peking University Core), 2021, No. 3, pp. 112-119 (8 pages)
Funding: National Natural Science Foundation of China (61761025).
Keywords: language identification; vocal tract impulse response spectral parameters; low-pass filtering; resampling; Teager energy operator cepstral coefficients
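
The front-end steps named in the abstract (low-pass filtering, resampling to a lower rate, and the Teager energy operator) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Hamming-windowed sinc filter, the 101-tap length, the cutoff at 90% of the new Nyquist frequency, and the integer decimation factor are all assumptions chosen for illustration, since the abstract does not give these values. The cepstral step and the GMM-UBM back end are not shown.

```python
import math


def lowpass_fir(cutoff, num_taps=101):
    """Hamming-windowed sinc low-pass FIR filter (illustrative design).

    cutoff is a fraction of the sampling rate (0 < cutoff < 0.5).
    num_taps is assumed to be odd so the filter has a centre tap.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]  # normalise to unity gain at DC


def decimate(signal, factor, num_taps=101):
    """Low-pass filter the signal, then keep every `factor`-th sample.

    The cutoff (90% of the new Nyquist frequency) is an assumed value.
    The convolution is centred, so the output is not delayed.
    """
    taps = lowpass_fir(0.9 * 0.5 / factor, num_taps)
    half = (num_taps - 1) // 2
    out = []
    for i in range(0, len(signal), factor):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):  # samples outside the signal count as zero
                acc += h * signal[j]
        out.append(acc)
    return out


def teager_energy(signal):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns values for interior samples only (two fewer than the input).
    """
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]
```

For example, `teager_energy(decimate(x, 2))` would halve the sampling rate of a signal `x` before applying the operator. For a pure tone `A*cos(w*n + phi)` the operator returns the constant `A**2 * sin(w)**2`, which is why it is often described as tracking both amplitude and frequency.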
