
Feature Mean Distance Based Speaker Clustering for Short Speech Segments (cited by: 9)
Abstract: This paper proposes a speaker clustering algorithm for short speech segments based on Feature Mean Distance (FMD). First, the FMD is defined to measure the similarity between two clusters at the feature level rather than the model level. Then, the two clusters with the smallest FMD are merged iteratively until the minimum FMD between any two clusters exceeds an adaptive threshold. Experiments on speech segments shorter than 3 s drawn from two speech databases show that, compared with the AHC+BIC based algorithm, the proposed algorithm improves the F measure by 5% on average and runs about 4.68 times as fast.
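Source: Journal of Electronics & Information Technology (《电子与信息学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 6, pp. 1404-1407 (4 pages).
Funding: Supported by the National Natural Science Foundation of China (61101160, 60972132), the Fundamental Research Funds for the Central Universities (2011ZM0029), and the Natural Science Foundation of Guangdong Province doctoral start-up project (10451064101004651).
Keywords: Speech signal processing; Speaker clustering; Feature Mean Distance (FMD); Short speech segments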
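
The merging procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact FMD formula or the adaptive threshold rule, so this sketch makes two loudly labeled assumptions: the FMD is taken to be the Euclidean distance between the mean feature vectors of two clusters, and the stopping threshold is a fixed, caller-supplied value. The names `mean_vector`, `feature_mean_distance`, and `cluster_segments` are hypothetical and do not come from the paper.

```python
import math


def mean_vector(frames):
    """Average a non-empty list of equal-length feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def feature_mean_distance(a, b):
    """ASSUMPTION: FMD modeled as the Euclidean distance between cluster means.

    The paper's actual definition may differ.
    """
    return math.dist(mean_vector(a), mean_vector(b))


def cluster_segments(segments, threshold):
    """Merge the closest pair of clusters until the minimum distance exceeds threshold.

    segments:  list of segments, each a list of feature vectors (frames).
    threshold: ASSUMPTION, a fixed stand-in for the paper's adaptive threshold.
    Returns a list of clusters, each a sorted list of segment indices.
    """
    clusters = [[i] for i in range(len(segments))]
    frames = [list(seg) for seg in segments]  # pooled frames of each cluster
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = feature_mean_distance(frames[i], frames[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # every remaining pair is farther apart than the threshold
        # Merge cluster j into cluster i (j > i, so index i stays valid).
        clusters[i] = sorted(clusters[i] + clusters[j])
        frames[i] = frames[i] + frames[j]
        del clusters[j], frames[j]
    return clusters


if __name__ == "__main__":
    # Toy 2-D "features": two segments near the origin, two near (5, 5).
    demo = [
        [[0.0, 0.0], [0.2, 0.0]],
        [[0.1, 0.1]],
        [[5.0, 5.0], [5.2, 5.0]],
        [[5.1, 4.9]],
    ]
    print(cluster_segments(demo, threshold=1.0))
```

Because distances are computed directly from pooled feature frames, no per-cluster model has to be trained or re-estimated after each merge, which is consistent with the feature-level (rather than model-level) comparison the abstract describes.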


