
Speaker Recognition Algorithm for Abnormal Speech Based on Abnormal Feature Weighting (cited by: 5)
Abstract Commonly used weighting algorithms have difficulty tracking the feature variations of abnormal speech. To address this, a speaker recognition algorithm for abnormal speech based on abnormal feature weighting is proposed. First, the probability distribution of each order of MFCC features is computed over a large number of normal speech samples to build a normal-speech feature template. Then, the K-L distance and the Euclidean distance between the features of a test utterance and the normal-speech template are used to measure how far the speech deviates from normal, and from these the K-L weighting factor and the Euclidean weighting factor are determined. Finally, the weighting factors are applied to the MFCC features of the test speech, and the weighted features are fed into a Gaussian mixture model (GMM) for speaker recognition on abnormal speech. Experimental results show that the overall recognition rates of the proposed algorithm with K-L weighting and with Euclidean weighting are 46.61% and 42.25%, respectively, whereas a weighting algorithm based on each feature order's contribution to speaker recognition and an unweighted algorithm reach 39.68% and 36.36%, respectively.
Source Journal of South China University of Technology (Natural Science Edition); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2012, No. 3, pp. 106-111 (6 pages)
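Funding National Natural Science Foundation of China (60972132, 61101160); Guangdong Natural Science Foundation Team Project (9351064101000003); Guangdong Natural Science Foundation Doctoral Research Start-up Project (10451064101004651); Fundamental Research Funds for the Central Universities, South China University of Technology (2011ZM0029)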
Keywords abnormal speech; speaker recognition; abnormal feature weighting; K-L distance; weighting factor
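The weighting steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the histogram-based template, the direction of the K-L term, the mapping from distance to weight (`1 / (1 + d)`, normalized to mean 1), and all function names are assumptions made for illustration. MFCC extraction and the GMM back end are omitted; synthetic Gaussian data stands in for MFCC frames.

```python
import math
import random

def histogram(values, edges):
    """Probability histogram of values over evenly spaced bin edges."""
    counts = [1e-6] * (len(edges) - 1)  # small floor avoids log(0) in the K-L term
    lo = edges[0]
    width = (edges[-1] - lo) / (len(edges) - 1)
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), len(counts) - 1)  # clamp out-of-range values into edge bins
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_distance(p, q):
    """K-L divergence D(p || q); the direction is an assumption."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def euclidean_distance(p, q):
    """Euclidean distance between two histograms (assumed comparison space)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def column(frames, k):
    """Values of the k-th feature order across all frames."""
    return [frame[k] for frame in frames]

def build_template(normal_frames, n_bins=20):
    """Per-order (edges, histogram) pairs estimated from normal speech frames."""
    template = []
    for k in range(len(normal_frames[0])):
        col = column(normal_frames, k)
        lo, hi = min(col), max(col)
        if hi == lo:
            hi = lo + 1.0  # guard against a constant feature
        edges = [lo + (hi - lo) * i / n_bins for i in range(n_bins + 1)]
        template.append((edges, histogram(col, edges)))
    return template

def weighting_factors(test_frames, template, distance):
    """One weight per feature order; larger deviation gives a smaller weight (assumed)."""
    dists = []
    for k, (edges, p_normal) in enumerate(template):
        p_test = histogram(column(test_frames, k), edges)
        dists.append(distance(p_test, p_normal))
    raw = [1.0 / (1.0 + d) for d in dists]  # hypothetical distance-to-weight mapping
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]  # normalize so the weights average to 1

def apply_weights(frames, weights):
    """Scale every frame's coefficients by the per-order weights."""
    return [[w * x for w, x in zip(weights, frame)] for frame in frames]

# Synthetic stand-in for MFCC frames: 3 feature orders, order 2 shifted in the test data.
random.seed(0)
normal = [[random.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
test = [[random.gauss(0, 1), random.gauss(0, 1), random.gauss(3, 1)] for _ in range(500)]

template = build_template(normal)
kl_w = weighting_factors(test, template, kl_distance)
eu_w = weighting_factors(test, template, euclidean_distance)
weighted_test = apply_weights(test, kl_w)  # would be passed on to a GMM scorer
```

In this sketch the feature order whose distribution deviates most from the normal-speech template receives the smallest weight. Whether the paper down-weights or up-weights deviating orders is not stated in the abstract, so the inverse mapping here should be read as one plausible choice.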











