Speaker Identification in Whispered Speech Based on Modified-MFCC
Abstract: Mel-frequency cepstral coefficients (MFCC) are a commonly used feature in speaker recognition, but this general-purpose feature performs poorly for speaker recognition on whispered speech. The triangular filters of the standard MFCC filter bank are uniformly distributed on the Mel scale, whereas whispered speech is produced differently from normal speech. This paper proposes changing that uniform distribution to improve the speaker recognition rate on whispered speech: the full frequency range is divided into bands, the filter density within each band is adjusted separately, and the filters of all bands are then combined into a new filter bank. Experiments analyzed how the recognition rate varies with the number of filters added to each frequency band, and the results were used to select a suitable number of filters for each band. The resulting model is designed as an adaptation to the voice source. In text-independent speaker recognition on whispered speech, the modified filter bank improves recognition over the original model.
Author: Ding Guoliang (丁国梁)
Source: Journal of Soochow University Engineering Science Edition (Bimonthly) (《苏州大学学报(工科版)》), CAS, 2009, No. 4, pp. 59-64 (6 pages)
Keywords: speaker identification; whispered speech; MFCC; triangular filter bank
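
The band-wise filter bank described in the abstract can be illustrated with a minimal sketch. The abstract gives no band boundaries or filter counts, so the values in `BANDS` below are purely hypothetical, as are the function names `modified_mel_points` and `triangular_filterbank`. Placing the filter centers uniformly in Mel within each band is also an assumption made for illustration; the paper's actual construction may differ.

```python
import math


def hz_to_mel(f_hz):
    """Widely used Mel mapping: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def modified_mel_points(bands):
    """Return filter edge/center frequencies in Hz.

    bands: contiguous, ascending (low_hz, high_hz, n_filters) tuples.
    Inside each band the centers are spaced uniformly in Mel, so the
    filter density can differ from band to band (an assumed scheme).
    The result has sum(n_filters) + 2 points.
    """
    centers = []
    for low_hz, high_hz, n_filters in bands:
        lo, hi = hz_to_mel(low_hz), hz_to_mel(high_hz)
        step = (hi - lo) / n_filters
        centers.extend(lo + (i + 0.5) * step for i in range(n_filters))
    points_mel = [hz_to_mel(bands[0][0])] + centers + [hz_to_mel(bands[-1][1])]
    return [mel_to_hz(m) for m in points_mel]


def triangular_filterbank(points_hz, n_fft, sample_rate):
    """Build one triangular filter per interior point.

    Filter j rises from points_hz[j] to a peak of 1 at points_hz[j + 1]
    and falls back to 0 at points_hz[j + 2]. Each row holds the weights
    for the n_fft // 2 + 1 non-negative FFT bins.
    """
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for left, center, right in zip(points_hz, points_hz[1:], points_hz[2:]):
        row = []
        for f in bin_hz:
            if left < f <= center:
                row.append((f - left) / (center - left))
            elif center < f < right:
                row.append((right - f) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


# Hypothetical bands for 8 kHz audio: the middle band gets more filters.
BANDS = [(0, 1000, 6), (1000, 2500, 12), (2500, 4000, 6)]
points = modified_mel_points(BANDS)
bank = triangular_filterbank(points, n_fft=256, sample_rate=8000)
print(len(bank), "filters over", len(bank[0]), "FFT bins")
```

Changing the third element of a tuple in `BANDS` makes that band denser or sparser without affecting the others, which is the kind of per-band adjustment the abstract describes. The cepstral coefficients would then be obtained in the usual way, by taking the log of the filter-bank energies and applying a DCT.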