Abstract
This paper presents a text-independent speaker verification system for conversational speech. Unlike conventional text-independent speaker verification systems, in which each training and test utterance contains a single speaker, both the training and the test speech here are conversations. The speech must therefore be segmented by speaker before the speaker models can be trained and the final decision made. The paper describes the Gaussian mixture model-universal background model (GMM-UBM) framework for speaker verification in detail and focuses on an unsupervised speech segmentation algorithm based on the GLR (Generalized Likelihood Ratio) distance measure. Finally, it presents two methods for normalizing the output score, ZNORM (Zero Normalization) and duration correction (a duration penalty), which together improve verification performance by nearly 10%.
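The abstract does not give formulas for the GLR distance or for ZNORM, so the following is only a minimal sketch of how these two quantities are commonly computed, not the authors' implementation. It assumes each speech segment is a list of feature vectors modeled by a single diagonal-covariance Gaussian, and that ZNORM uses the mean and standard deviation of a target model's scores on impostor utterances. The names `glr_distance`, `znorm`, and `impostor_scores` are hypothetical, and the duration correction is not sketched because the abstract does not describe it.

```python
import math


def _log_det_diag(frames):
    """Sum of log ML variances over dimensions (log-determinant of a diagonal covariance)."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        total += math.log(max(var, 1e-10))  # variance floor avoids log(0)
    return total


def glr_distance(seg_x, seg_y):
    """Log-likelihood gained by modeling two segments separately instead of jointly.

    Assumption: one diagonal-covariance Gaussian per segment, fitted by maximum
    likelihood. A small value suggests the same speaker; a large value suggests
    a speaker change between the segments.
    """
    nx, ny = len(seg_x), len(seg_y)
    joint = list(seg_x) + list(seg_y)
    return 0.5 * ((nx + ny) * _log_det_diag(joint)
                  - nx * _log_det_diag(seg_x)
                  - ny * _log_det_diag(seg_y))


def znorm(raw_score, impostor_scores):
    """Zero normalization: shift and scale a score by impostor-score statistics."""
    n = len(impostor_scores)
    mu = sum(impostor_scores) / n
    sigma = math.sqrt(sum((s - mu) ** 2 for s in impostor_scores) / n)
    if sigma == 0.0:
        raise ValueError("impostor scores have zero variance")
    return (raw_score - mu) / sigma
```

In a segmentation pass, `glr_distance` would typically be evaluated on adjacent windows and compared against a threshold or used as a clustering criterion; the threshold itself is a tuning choice that the abstract does not specify.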
Source
《中文信息学报》 (Journal of Chinese Information Processing)
CSCD
Peking University Core Journals (北大核心)
2004, Issue 2, pp. 36-43 (8 pages)
Funding
Supported by the National Natural Science Foundation of China (60272039)
Supported by the Natural Science Foundation of Anhui Province (01042205)