Abstract
A speaker clustering algorithm based on near and far distances is proposed. First, a voice activity detector segments the speech into speech segments. Next, the T2 formula is used to cluster near-distance speech segments that belong to the same speaker, yielding speech chunks. Finally, spectral clustering is used to estimate the number of speakers and to cluster the far-distance speakers (speech chunks). Experimental results show that, in near-distance clustering, T2 improves speech-chunk precision by 2.62% and 13.84% compared with BIC and KL, respectively. In far-distance clustering, spectral clustering can largely recover the number of speakers in the speech, and cluster purity and speaker purity reach 78% when there are 15 speakers, which indicates that the algorithm clusters speakers effectively.
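The near-distance step in the abstract can be illustrated with a minimal sketch. It assumes that "T2" refers to Hotelling's two-sample T² statistic, which the abstract does not state explicitly, and that adjacent segments are merged greedily when the statistic falls below a threshold. The function names, the threshold value, and the toy feature frames are hypothetical and are not taken from the paper.

```python
from typing import List

Frame = List[float]          # one feature vector (e.g. one frame of acoustic features)
Segment = List[Frame]        # a speech segment is a list of frames


def mean(frames: Segment) -> List[float]:
    """Per-dimension mean of the frames."""
    n, d = len(frames), len(frames[0])
    return [sum(f[j] for f in frames) / n for j in range(d)]


def scatter(frames: Segment, mu: List[float]) -> List[List[float]]:
    """Sum of outer products of the mean-centred frames."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for f in frames:
        diff = [f[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                s[a][b] += diff[a] * diff[b]
    return s


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def hotelling_t2(seg_a: Segment, seg_b: Segment) -> float:
    """Two-sample Hotelling T^2 with a pooled covariance (assumed form of 'T2')."""
    n1, n2 = len(seg_a), len(seg_b)
    mu1, mu2 = mean(seg_a), mean(seg_b)
    s1, s2 = scatter(seg_a, mu1), scatter(seg_b, mu2)
    d = len(mu1)
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(d)]
              for i in range(d)]
    diff = [mu1[j] - mu2[j] for j in range(d)]
    x = solve(pooled, diff)  # x = pooled^{-1} (mu1 - mu2)
    return n1 * n2 / (n1 + n2) * sum(diff[j] * x[j] for j in range(d))


def merge_adjacent(segments: List[Segment], threshold: float) -> List[Segment]:
    """Greedily merge each segment into the previous chunk when T^2 is small."""
    chunks = [list(segments[0])]
    for seg in segments[1:]:
        if hotelling_t2(chunks[-1], seg) < threshold:
            chunks[-1].extend(seg)      # likely the same speaker: grow the chunk
        else:
            chunks.append(list(seg))    # likely a speaker change: start a new chunk
    return chunks


# Toy data: two segments with identical statistics, then a clearly shifted one.
seg_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
seg_b = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
seg_c = [[10.0, 10.0], [11.0, 10.0], [10.0, 11.0], [11.0, 11.0]]
chunks = merge_adjacent([seg_a, seg_b, seg_c], threshold=10.0)
```

In this toy run the first two segments end up in one chunk and the third starts a new one. In practice the threshold would need tuning on held-out data, and the resulting chunks would then be passed to the far-distance (spectral clustering) stage.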
Source
Science Technology and Engineering (《科学技术与工程》)
PKU Core Journal
2013, No. 12, pp. 3297-3300 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (61101160)
Keywords
speaker clustering
near-distance clustering
far-distance clustering