Improved speaker clustering initialization and GMM multi-speaker recognition (cited by: 5)
Abstract: To address the poor accuracy of linear initialization for multi-speaker clustering, this paper proposes an improved clustering initialization method. The method uses the Bayesian information criterion (BIC) to detect and split the initial clusters produced by linear initialization, which effectively raises the purity of the initial speaker clusters. The method is then applied to a Gaussian mixture model (GMM) multi-speaker recognition system. Experimental results show that the proposed method raises the average cluster purity (ACP) by 48.51% and lowers the system's recognition error rate by 12.09% on average.
Authors: CAO Jie (曹洁), YU Lizhen (余丽珍)
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2012, No. 2, pp. 590-593 (4 pages)
Funding: Gansu Provincial Department of Finance project (0914ZTB148); Natural Science Foundation of Gansu Province (1014ZSB064)
Keywords: multi-speaker recognition; improved clustering initialization; Gaussian mixture model; average cluster purity
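
The abstract names two quantities without giving formulas: a BIC test used to split impure initial clusters, and the average cluster purity (ACP) used to score them. The sketch below is illustrative only and is not taken from the paper. It assumes the common delta-BIC form for a single full-covariance Gaussian (reduced here to one-dimensional features for brevity) and the widely used ACP definition based on a cluster-by-speaker frame count table. The function names, the penalty weight `lam`, and the variance floor are hypothetical choices.

```python
import math


def variance(xs, floor=1e-6):
    """Maximum-likelihood variance, floored so that log() stays finite."""
    mean = sum(xs) / len(xs)
    return max(sum((x - mean) ** 2 for x in xs) / len(xs), floor)


def delta_bic(frames, split, lam=1.0):
    """Delta-BIC for splitting 1-D `frames` at index `split`.

    Positive values favour modelling the two halves with separate
    Gaussians, which is read as a speaker change inside the cluster.
    """
    left, right = frames[:split], frames[split:]
    n, n1, n2 = len(frames), len(left), len(right)
    gain = 0.5 * (n * math.log(variance(frames))
                  - n1 * math.log(variance(left))
                  - n2 * math.log(variance(right)))
    d = 1  # feature dimension (assumed 1-D in this sketch)
    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * math.log(n)
    return gain - penalty


def average_cluster_purity(counts):
    """ACP from counts[i][j] = frames in cluster i spoken by speaker j."""
    total = sum(sum(row) for row in counts)
    weighted = sum(sum(c * c for c in row) / sum(row)
                   for row in counts if sum(row) > 0)
    return weighted / total


# Toy data: one speaker centred near 0, another centred near 5.
speaker_a = [0.0, 0.1, -0.1, 0.05, -0.05] * 4
speaker_b = [5.0, 5.1, 4.9, 5.05, 4.95] * 4
mixed_cluster = speaker_a + speaker_b   # impure initial cluster
pure_cluster = speaker_a + speaker_a    # single-speaker cluster

split_mixed = delta_bic(mixed_cluster, len(speaker_a)) > 0
split_pure = delta_bic(pure_cluster, len(speaker_a)) > 0
```

In this toy setup the mixed cluster yields a positive delta-BIC at the true boundary and would be split, while the single-speaker cluster does not. A perfectly separated count table such as `[[10, 0], [0, 10]]` gives an ACP of 1.0, and a single cluster shared equally by two speakers, `[[5, 5]]`, gives 0.5.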