Research on Speaker Diarization Based on a DS Evidence Theory Multi-Feature Fusion Model

Abstract Speaker diarization is an important research topic in speech processing. To improve diarization accuracy, this paper proposes a multi-feature fusion model based on Dempster-Shafer (DS) evidence theory for extracting speaker embeddings. The method uses two combined features to represent speech more effectively as input to DenseNet networks, and applies DS evidence theory to fuse the softmax-layer outputs to obtain the speaker embedding. The model is compared experimentally against DenseNet networks with single-feature input and with combined-feature input. The results show that speaker diarization based on the proposed model identifies the target speaker more accurately.
Source Technology Innovation and Application (《科技创新与应用》), 2023, No. 23, pp. 108-111 (4 pages)
Keywords speaker diarization; DS evidence theory (Dempster-Shafer theory); DenseNet (dense convolutional network); combined features; robustness
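
The fusion step described in the abstract can be illustrated with Dempster's rule of combination. The sketch below is not the paper's implementation: it assumes each softmax output is treated as a Bayesian mass function, with mass assigned only to single speaker classes, which the abstract does not confirm. The function name `dempster_combine` and the example scores are hypothetical.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Assumption (not from the paper): both inputs put mass only on
    singleton classes, so the rule reduces to a normalized
    element-wise product.
    """
    if len(m1) != len(m2):
        raise ValueError("mass functions must cover the same classes")

    # Mass on which both sources agree, class by class.
    joint = [a * b for a, b in zip(m1, m2)]

    # agreement = 1 - K, where K is the conflict between the sources.
    agreement = sum(joint)
    if agreement <= 0.0:
        raise ValueError("sources are in total conflict; rule is undefined")

    # Renormalize so the fused masses sum to 1.
    return [j / agreement for j in joint]


# Hypothetical softmax outputs from two networks, one per combined feature.
softmax_a = [0.6, 0.3, 0.1]
softmax_b = [0.5, 0.4, 0.1]
fused = dempster_combine(softmax_a, softmax_b)
```

Under this assumption, classes that both networks score highly are reinforced and classes that either network scores low are suppressed. A mass function that also assigns belief to sets of classes would require the general form of the rule.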