Abstract
Speaker diarization groups together the speech segments in a recording that were uttered by the same speaker. This paper presents a speaker clustering algorithm based on a novel combination of the Cross Log-likelihood Ratio (CLR) and the standard Bayesian information criterion (BIC). The CLR serves as the inter-cluster distance measure between speech segments, while BIC provides a suitable stopping criterion for clustering. The method combines the advantages of both and performs well in unsupervised speaker clustering. Experimental results show that the combined CLR and BIC approach is more accurate than clustering based on CLR alone.
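The division of labor described in the abstract (CLR picks which pair of clusters to merge, BIC decides when to stop) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes 1-D features, a single Gaussian per cluster, a background model fitted on all the data in place of a trained universal background model, and the common sign convention in which a negative ΔBIC favors merging. All function names and the `penalty_weight` parameter are hypothetical.

```python
import math


def fit_gauss(xs):
    """Maximum-likelihood mean and variance of 1-D samples (variance floored)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, max(var, 1e-6)


def avg_loglik(xs, model):
    """Average per-frame log-likelihood of xs under a 1-D Gaussian model."""
    mu, var = model
    total = sum(-0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var) for x in xs)
    return total / len(xs)


def clr(a, b, background):
    """Cross log-likelihood ratio; a larger value means more similar clusters."""
    model_a, model_b = fit_gauss(a), fit_gauss(b)
    return (avg_loglik(a, model_b) - avg_loglik(a, background)
            + avg_loglik(b, model_a) - avg_loglik(b, background))


def delta_bic(a, b, penalty_weight=1.0):
    """Delta-BIC for merging a and b; negative favors merging (assumed convention)."""
    na, nb = len(a), len(b)
    n = na + nb
    _, var_a = fit_gauss(a)
    _, var_b = fit_gauss(b)
    _, var_ab = fit_gauss(a + b)
    # A 1-D Gaussian has 2 free parameters (mean, variance): penalty = 0.5 * 2 * log(n).
    penalty = penalty_weight * math.log(n)
    return 0.5 * (n * math.log(var_ab) - na * math.log(var_a) - nb * math.log(var_b)) - penalty


def cluster(segments, penalty_weight=1.0):
    """Agglomerative clustering: merge the best CLR pair until BIC rejects the merge."""
    clusters = [list(s) for s in segments]
    # Assumption: the background model is a Gaussian fitted on all frames.
    background = fit_gauss([x for s in segments for x in s])
    while len(clusters) > 1:
        # Score every pair by CLR and take the most similar one.
        pairs = [(clr(clusters[i], clusters[j], background), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = max(pairs)
        # Stop as soon as BIC no longer supports merging the best pair.
        if delta_bic(clusters[i], clusters[j], penalty_weight) >= 0:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data: two segments near 0 and two near 10 (made-up values for illustration).
segments = [
    [0.0, 0.5, -0.5, 0.2, -0.2, 0.1, -0.1, 0.3],
    [0.1, -0.3, 0.4, -0.4, 0.0, 0.2, -0.2, 0.3],
    [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.3],
    [9.9, 10.4, 9.6, 10.0, 10.2, 9.7, 10.3, 10.1],
]
result = cluster(segments)
print([len(c) for c in result])
```

Keeping the pair selection and the stopping test as separate functions mirrors the split the abstract describes. In a practical system the single Gaussians would typically be replaced by adapted Gaussian mixture models over multidimensional acoustic features.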
Source
Technical Acoustics (《声学技术》)
CSCD
Peking University Core Journal (北大核心)
2007, No. 6, pp. 1181-1185 (5 pages)
Funding
National 973 Program (2004CB318106)
Natural Science Foundation (10574140, 60535030)
Beijing Municipal Science and Technology Commission (Z0005189040391)