Improved online speaker clustering based on decision tree (cited by: 1)


Abstract: Speaker clustering is a key component of many speech processing applications. Traditional online clustering suffers from error propagation: an early wrong decision raises the error rate of later decisions. To address this, an improved online speaker clustering algorithm based on a decision tree is proposed. Unlike typical online approaches, the method builds a decision tree that adds decision branches when assigning each audio segment to a cluster, which effectively reduces the influence of early decision errors on subsequent clustering. To improve efficiency and shorten computation time, a pruning strategy is also given that eliminates unreasonable candidate branches. Speaker clustering experiments on broadcast news data show that, compared with a traditional hierarchical clustering algorithm, the new algorithm improves average cluster purity and speaker purity by 0.9% and 1.1% respectively and reduces computation time by 57%. The experiments also show that, compared with using manually labeled speaker information, applying the algorithm's clustering results to speaker adaptation lowers the system's recognition error rate.
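The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decision criterion or the pruning rule, so everything below is an assumption made for illustration. Segments are represented as plain feature vectors, the cost of joining a cluster is the squared distance to its centroid, opening a new cluster costs a fixed `new_cluster_penalty`, and pruning keeps only the `beam_width` cheapest branches. The names `Hypothesis`, `cluster_online`, `new_cluster_penalty`, and `beam_width` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One root-to-leaf path of the decision tree: a labeling of all segments seen so far."""
    labels: Tuple[int, ...]  # cluster index assigned to each segment, in arrival order
    cost: float              # accumulated decision cost (lower is better)


def _centroid(points: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (assumed stand-in for the paper's criterion)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_online(
    segments: Sequence[Sequence[float]],
    new_cluster_penalty: float = 1.0,  # assumed cost of opening a new cluster
    beam_width: int = 4,               # assumed pruning rule: keep the N cheapest branches
) -> List[int]:
    """Label segments one at a time while keeping several alternative decisions alive."""
    hypotheses = [Hypothesis(labels=(), cost=0.0)]
    for seg in segments:
        children = []
        for hyp in hypotheses:
            n_clusters = max(hyp.labels) + 1 if hyp.labels else 0
            # Branch 1..k: assign the segment to each existing cluster.
            for c in range(n_clusters):
                members = [segments[j] for j, lab in enumerate(hyp.labels) if lab == c]
                step = _sq_dist(seg, _centroid(members))
                children.append(Hypothesis(hyp.labels + (c,), hyp.cost + step))
            # Branch k+1: treat the segment as a new speaker.
            children.append(
                Hypothesis(hyp.labels + (n_clusters,), hyp.cost + new_cluster_penalty)
            )
        # Pruning: drop all but the cheapest branches before the next segment arrives.
        children.sort(key=lambda h: h.cost)
        hypotheses = children[:beam_width]
    return list(hypotheses[0].labels)


if __name__ == "__main__":
    toy_segments = [[0.0], [0.1], [5.0], [5.1], [0.05]]
    print(cluster_online(toy_segments))
```

With `beam_width=1` the sketch reduces to a conventional greedy online clusterer, where each decision is final. A larger beam lets a later segment overturn an earlier choice, at the price of more computation, which is the trade-off the pruning step is meant to control.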

Authors: 张素敏 (Zhang Sumin), 苏东林 (Su Donglin), 王炜 (Wang Wei)

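Affiliations: School of Electronic and Information Engineering, Beihang University (北京航空航天大学); China Electronics Technology Group Corporation, No. […]; Air Force Equipment Research Institute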

Source: Optics and Precision Engineering (《光学精密工程》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2010, No. 1, pp. 227-233 (7 pages)

Funding: National High-Tech Research and Development Program of China (863 Program), No. 2006AA701418

Keywords: speaker clustering; online clustering; decision tree; pruning strategy

Classification number: TP391.9 [Automation and Computer Technology: Computer Application Technology]

