Speaker clustering algorithm based on cross log-likelihood ratio and Bayesian information criterion (cited by: 3)

Speaker diarization algorithm based on CLR and BIC
Abstract The task of speaker diarization is to group together the speech segments in a recording that were uttered by the same speaker. This paper presents a speaker clustering algorithm that combines the cross log-likelihood ratio (CLR) with the standard Bayesian information criterion (BIC). CLR measures the similarity between speech segments and serves as the inter-cluster distance, while BIC provides a suitable stopping criterion for clustering. The method combines the advantages of both techniques and performs well in unsupervised speaker clustering. Experimental results show that the combined CLR and BIC approach is more accurate than clustering based on CLR alone.
Source Technical Acoustics (《声学技术》), CSCD, Peking University Core Journal, 2007, No. 6, pp. 1181-1185 (5 pages)
Funding National 973 Program (2004CB318106); Natural Science Foundation (10574140, 60535030); Beijing Municipal Science and Technology Commission (Z0005189040391)
Keywords speaker clustering, speaker diarization, cross log-likelihood ratio (CLR), Bayesian information criterion (BIC), clustering
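
The abstract describes the method only at a high level: CLR ranks candidate cluster pairs, and BIC decides when to stop merging. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes each segment is a NumPy array of feature frames, models every cluster with a single full-covariance Gaussian instead of a Gaussian mixture model, and uses a Gaussian fitted to all frames as the background model. The function names (`fit_gaussian`, `clr`, `delta_bic`, `cluster_segments`) and the `penalty_weight` parameter are hypothetical.

```python
import numpy as np


def fit_gaussian(x):
    """Fit one full-covariance Gaussian (assumed stand-in for a speaker model)."""
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])  # small ridge
    return mu, cov


def mean_loglik(x, model):
    """Average per-frame log-likelihood of frames x under a Gaussian model."""
    mu, cov = model
    diff = x - mu
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    logdet = np.linalg.slogdet(cov)[1]
    d = x.shape[1]
    return float(np.mean(-0.5 * (d * np.log(2 * np.pi) + logdet + maha)))


def clr(xa, xb, background):
    """Cross log-likelihood ratio; larger values mean more similar clusters."""
    ma, mb = fit_gaussian(xa), fit_gaussian(xb)
    return (mean_loglik(xa, mb) - mean_loglik(xa, background)
            + mean_loglik(xb, ma) - mean_loglik(xb, background))


def delta_bic(xa, xb, penalty_weight=1.0):
    """BIC change for modelling xa and xb separately instead of jointly.

    With this sign convention, a negative value favours merging.
    """
    n_a, n_b = len(xa), len(xb)
    n, d = n_a + n_b, xa.shape[1]

    def logdet(x):
        return np.linalg.slogdet(fit_gaussian(x)[1])[1]

    n_params = d + d * (d + 1) / 2  # mean plus symmetric covariance
    return (0.5 * n * logdet(np.vstack([xa, xb]))
            - 0.5 * n_a * logdet(xa) - 0.5 * n_b * logdet(xb)
            - 0.5 * penalty_weight * n_params * np.log(n))


def cluster_segments(segments, penalty_weight=1.0):
    """Bottom-up clustering: pick the pair by CLR, stop when BIC rejects it."""
    data = [np.asarray(s, dtype=float) for s in segments]
    background = fit_gaussian(np.vstack(data))
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > 1:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        # Most similar pair according to CLR.
        i, j = max(pairs, key=lambda p: clr(data[p[0]], data[p[1]], background))
        # BIC stopping criterion: stop if the best candidate should not merge.
        if delta_bic(data[i], data[j], penalty_weight) >= 0:
            break
        clusters[i] += clusters.pop(j)
        data[i] = np.vstack([data[i], data.pop(j)])
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 2-D "features": two segments per simulated speaker.
    segs = [rng.normal(0.0, 1.0, (200, 2)), rng.normal(0.0, 1.0, (200, 2)),
            rng.normal(5.0, 1.0, (200, 2)), rng.normal(5.0, 1.0, (200, 2))]
    print(cluster_segments(segs))
```

Separating the two roles keeps the ranking measure and the stopping rule independent, so either could be swapped (for example, adapted mixture models for the CLR step) without touching the merge loop. Raising `penalty_weight` makes the BIC test more willing to merge, which yields fewer clusters.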