Speaker Identification in Whispered Speech Based on Modified-MFCC

Abstract: Mel-frequency cepstral coefficients (MFCC) are a commonly used feature in speaker recognition, but this general-purpose feature performs poorly on whispered speech. The triangular filter bank of standard MFCC is uniformly distributed on the Mel scale, whereas whispered speech is produced differently from normal speech. This paper improves the speaker recognition rate for whispered speech by changing that uniform distribution: the full frequency range is divided into bands, the density of filters within each band is adjusted, and the filters of all bands are then combined into a new filter bank. Experiments analyze how the recognition rate changes with the number of filters added to each frequency band, and the results are used to choose a suitable number of filters for each band. The modified filter model, designed as an adaptation to the voice source, improves on the original model in text-independent speaker recognition of whispered speech.
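The band-wise construction described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the band edges (0, 1000, 4000, 8000 Hz), the per-band filter counts (6, 12, 6), the sample rate, the FFT size, and all function names are hypothetical, since the abstract does not give the actual values. Filter centres are spaced uniformly on the Mel scale inside each band, so a different count per band changes the local filter density.

```python
import numpy as np


def hz_to_mel(f):
    """Convert Hz to Mel (common 2595*log10(1 + f/700) formula)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def band_center_frequencies(band_edges_hz, filters_per_band):
    """Centre frequencies (Hz), uniform in Mel within each band."""
    centers = []
    bands = zip(band_edges_hz[:-1], band_edges_hz[1:])
    for (lo, hi), n in zip(bands, filters_per_band):
        # n centres strictly inside the band: drop both end points.
        mels = np.linspace(hz_to_mel(lo), hz_to_mel(hi), n + 2)[1:-1]
        centers.extend(mel_to_hz(mels))
    return np.array(centers)


def triangular_filterbank(centers_hz, f_min, f_max, n_fft, sample_rate):
    """Triangular filters whose feet sit on the neighbouring centres."""
    points = np.concatenate(([f_min], centers_hz, [f_max]))
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((len(centers_hz), len(freqs)))
    for i in range(len(centers_hz)):
        left, center, right = points[i], points[i + 1], points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


# Hypothetical settings, chosen only for illustration.
SAMPLE_RATE = 16000
N_FFT = 512
BAND_EDGES_HZ = [0.0, 1000.0, 4000.0, 8000.0]
FILTERS_PER_BAND = [6, 12, 6]  # denser filters in the middle band

# Baseline: one band covering everything, i.e. uniform Mel spacing.
uniform_centers = band_center_frequencies([0.0, 8000.0], [24])

# Modified layout: per-band density, then all bands combined.
centers = band_center_frequencies(BAND_EDGES_HZ, FILTERS_PER_BAND)
bank = triangular_filterbank(centers, 0.0, 8000.0, N_FFT, SAMPLE_RATE)
```

In a full MFCC pipeline, `bank` would be multiplied with each frame's power spectrum, followed by a logarithm and a discrete cosine transform; those steps are unchanged by the filter layout and are omitted here.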

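Author: Ding Guoliang

Affiliation: School of Electronics and Information Engineering, Soochow University

Source: Journal of Soochow University Engineering Science Edition (Bimonthly), CAS, 2009, No. 4, pp. 59-64 (6 pages)

Keywords: speaker identification; whispered speech; MFCC; triangular filter bank

Classification number: TN912.341 [Electronics and Telecommunications: Communication and Information Systems]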

