Speaker Diarization Based on Length Normalization MAP (cited by: 1)
Abstract: This paper proposes, for the first time, a length-normalized maximum a posteriori (MAP) method and applies it to two distance metrics used in speaker diarization: the cross likelihood ratio (CLR) and the T-Test distance. Classical MAP computes statistics against the universal background model (UBM) and then shifts the model parameters away from the UBM, so the size of the shift is positively correlated with the length of the speech segment. When the similarity of two segments of different lengths is measured, classical MAP therefore characterizes the speaker models inaccurately, which degrades the distance metric. The proposed method normalizes the relevance factor according to the segment length during MAP and only then adapts the model parameters, so that the adapted parameters are independent of segment length and better reflect speaker identity. On a diarization evaluation task using Chinese multi-speaker TV talk-show data, length-normalized MAP clearly improves on classical MAP under both metrics: the diarization error rate falls by a relative 3.5% under the CLR criterion and by a relative 10.7% under the T-Test criterion.
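The abstract does not give the normalization formula, so the following is only a minimal sketch of the idea under stated assumptions. It uses mean-only MAP adaptation of a one-dimensional GMM, and it assumes one plausible normalization in which the relevance factor is scaled in proportion to the number of frames. The function `map_adapt_means`, the parameter `ref_frames`, and the toy posteriors are hypothetical and are not taken from the paper.

```python
def map_adapt_means(frames, posteriors, ubm_means, relevance=16.0, ref_frames=None):
    """Mean-only MAP adaptation of a 1-D GMM (illustrative sketch).

    frames:      scalar features, one per frame
    posteriors:  posteriors[t][k] = P(component k | frame t) under the UBM
    ubm_means:   UBM mean of each component
    relevance:   relevance factor r of classical MAP
    ref_frames:  hypothetical reference length; None selects classical MAP
    """
    num_frames = len(frames)
    # ASSUMPTION: scale r in proportion to segment length. The paper's
    # actual normalization is not stated in the abstract.
    r = relevance if ref_frames is None else relevance * num_frames / ref_frames
    adapted = []
    for k, mu in enumerate(ubm_means):
        n_k = sum(p[k] for p in posteriors)                      # zeroth-order statistic
        f_k = sum(p[k] * x for p, x in zip(posteriors, frames))  # first-order statistic
        alpha = n_k / (n_k + r)                                  # adaptation coefficient
        mean_k = f_k / n_k if n_k > 0 else mu                    # data mean of component k
        adapted.append(alpha * mean_k + (1.0 - alpha) * mu)
    return adapted


# Toy data: the "long" segment is the short one repeated ten times, so both
# carry identical speaker information and differ only in length.
frames = [1.0, 1.2, 0.8, 3.1, 2.9]
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.9, 0.1], [0.1, 0.9], [0.2, 0.8]]
ubm_means = [0.0, 4.0]

short_classic = map_adapt_means(frames, posteriors, ubm_means)
long_classic = map_adapt_means(frames * 10, posteriors * 10, ubm_means)
short_norm = map_adapt_means(frames, posteriors, ubm_means, ref_frames=100)
long_norm = map_adapt_means(frames * 10, posteriors * 10, ubm_means, ref_frames=100)
```

With classical MAP the longer segment moves further from the UBM means even though its content is the same, whereas under the assumed scaling the adaptation coefficient depends only on the fraction of frames assigned to each component, so the short and long segments yield the same adapted means up to rounding.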
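Authors: Zhu Weixin (朱唯鑫), Guo Wu (郭武)
Source: Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 7, pp. 859-865 (7 pages)
Funding: Supported by the Natural Science Foundation of Anhui Province (1408085MKL78)
Keywords: speaker diarization; maximum a posteriori; length normalization; cross likelihood ratio; T-Test distance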