
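Noise-robust speaker recognition based on sub-band weighted harmonic spectrum reconstruction of voiced speech (cited by: 5)

Published English title: Robust speaker recognition based on harmonic spectrum reconstruction of voiced speech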
Abstract: A speaker recognition algorithm based on harmonic spectrum reconstruction of voiced speech is proposed. Using the structure of the short-time spectrum of voiced speech together with pitch information, the algorithm reconstructs the harmonic spectrum of each voiced segment with sub-band weighting, compensating for the noise-induced mismatch between training and testing conditions. Perceptual linear predictive (PLP) cepstral coefficients are extracted from the reconstructed spectrum and combined with pitch to form the feature vector of a given speaker, and each speaker is modeled with a Gaussian mixture model (GMM). Simulation results indicate that the proposed reconstruction compensates for noise effectively across several types of noisy speech. For text-independent speaker recognition, it clearly improves recognition accuracy in noisy environments, especially at low SNR, without noticeably degrading accuracy on clean or high-SNR speech.
Source: Journal of Southeast University (Natural Science Edition), 2008, No. 6, pp. 935-941 (7 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
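Funding: National Key Basic Research Program of China (973 Program), grant 2002CB312102; Natural Science Research Program of Jiangsu Higher Education Institutions, grant 07KJD510110.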
Keywords: speaker recognition; spectrum reconstruction; perceptual linear predictive cepstrum coefficient; noise compensation; spectral flatness measure
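The abstract names sub-band weighting and the keywords name the spectral flatness measure (SFM), but this record does not give the paper's formulas. The following is only a minimal sketch of how the two ideas could fit together, not the authors' method. The SFM definition (geometric mean over arithmetic mean of the power spectrum) is standard; the function names, the equal-width band split, and the rule of using the SFM itself as the blending weight are all assumptions made for illustration.

```python
import numpy as np

def spectral_flatness(power):
    """Standard SFM: geometric mean / arithmetic mean of a power spectrum.

    Close to 1 for a flat (noise-like) spectrum, close to 0 for a peaky
    (strongly harmonic) one.
    """
    power = np.asarray(power, dtype=float) + 1e-12  # guard against log(0)
    return float(np.exp(np.mean(np.log(power))) / np.mean(power))

def subband_flatness(power, n_bands):
    """SFM of each of n_bands roughly equal-width sub-bands (assumed split)."""
    return np.array([spectral_flatness(b) for b in np.array_split(power, n_bands)])

def reconstruct(power, harmonic_model, n_bands=4):
    """Hypothetical sub-band weighted blend (an assumption, not the paper's rule).

    Bands that look noise-like (high SFM) lean on the harmonic model spectrum;
    bands with clear harmonic peaks (low SFM) keep the observed spectrum.
    """
    power = np.asarray(power, dtype=float)
    harmonic_model = np.asarray(harmonic_model, dtype=float)
    sfm = subband_flatness(power, n_bands)
    sizes = [len(b) for b in np.array_split(power, n_bands)]
    edges = np.cumsum([0] + sizes)
    out = np.empty_like(power)
    for k in range(n_bands):
        band = slice(edges[k], edges[k + 1])
        w = sfm[k]  # assumed weight in (0, 1]
        out[band] = (1.0 - w) * power[band] + w * harmonic_model[band]
    return out
```

In this sketch `harmonic_model` stands for a spectrum synthesized from the estimated pitch and its harmonics; how that model is built, and how the real weights are derived, would have to come from the full paper.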


