Dynamic Speaker Selected Training for Rapid Accent Adaptation (cited by: 1)

Dynamic speaker selected training for rapid speaker adaptation


Abstract: Practical speech recognition systems need rapid adaptation to each speaker's accent to work well across a wide variety of speakers. This paper proposes a dynamic speaker selected training method for that purpose. Building on the basic speaker selected training method, it chooses the training speakers with a confidence measure computed from Gaussian mixture model (GMM) likelihood scores, replacing the basic method's selection of a fixed absolute number of speakers; this improves both the efficiency of the selection and the generality of the selection criterion. The acoustic model for dynamic speaker selected training is then synthesized as a weighted combination, with each training speaker weighted according to how closely it resembles the speaker being adapted to, which further improves the adaptation. Experiments show that the method raises the recognition accuracy from 80.16% to 84.12%, a relative error rate reduction of 19.96%, improving the practical performance of the baseline system.
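The abstract names two steps, likelihood-based speaker selection and likelihood-weighted model combination, without giving their formulas. The sketch below is only one plausible reading, not the authors' implementation: the relative threshold `margin`, the softmax weighting, the single-Gaussian mean combination, and all function and speaker names are assumptions made for illustration. The last function simply reproduces the reported relative error rate reduction from the two accuracy figures.

```python
import math

def select_speakers(log_likelihoods, margin):
    """Keep training speakers scoring within `margin` of the best speaker.

    log_likelihoods maps a speaker id to the average log-likelihood of the
    adaptation data under that speaker's GMM (hypothetical input). The
    relative threshold is an assumed stand-in for the paper's confidence
    measure; it lets the number of selected speakers vary per test speaker.
    """
    best = max(log_likelihoods.values())
    return {s: ll for s, ll in log_likelihoods.items() if best - ll <= margin}

def speaker_weights(selected):
    """Assumed weighting: softmax over log-likelihoods, summing to 1."""
    best = max(selected.values())
    exps = {s: math.exp(ll - best) for s, ll in selected.items()}
    total = sum(exps.values())
    return {s: e / total for s, e in exps.items()}

def combine_means(speaker_means, weights):
    """Weighted average of per-speaker mean vectors (one Gaussian only)."""
    dim = len(next(iter(speaker_means.values())))
    combined = [0.0] * dim
    for s, w in weights.items():
        for i, x in enumerate(speaker_means[s]):
            combined[i] += w * x
    return combined

def relative_error_reduction(acc_before, acc_after):
    """Relative error rate reduction in percent, from accuracies in percent."""
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return 100.0 * (err_before - err_after) / err_before

# Toy data, invented for the example.
scores = {"spk_a": -61.0, "spk_b": -62.0, "spk_c": -75.0}
means = {"spk_a": [1.0, 2.0], "spk_b": [3.0, 4.0], "spk_c": [9.0, 9.0]}

selected = select_speakers(scores, margin=5.0)   # spk_c falls outside the margin
weights = speaker_weights(selected)
adapted_mean = combine_means(means, weights)

# Accuracy figures from the abstract: 80.16% -> 84.12%.
reduction = round(relative_error_reduction(80.16, 84.12), 2)  # 19.96
```

With a threshold on the score rather than a fixed count, a test speaker who resembles many training speakers draws on more of them, and an unusual one draws on fewer, which is the motivation the abstract gives for dropping the absolute-number criterion.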

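Authors: DONG Ming, LIU Jia, LIU Runsheng

Affiliation: Department of Electronic Engineering, Tsinghua University

Source: Journal of Tsinghua University (Science and Technology), 2005, No. 7, pp. 912-915 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (60272016)

Keywords: speech recognition; rapid speaker adaptation; confidence measure

Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]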


References (6)

1. Hazen T J. A comparison of novel techniques for rapid speaker adaptation [J]. Speech Communication, 2000, 31: 15-33.
2. Gauvain J-L, Lee C-H. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains [J]. IEEE Trans on Speech and Audio Processing, 1994, 2: 291-298.
3. Leggetter C J, Woodland P C. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models [J]. Computer Speech and Language, 1995, 9(2): 171-185.
4. Padmanabhan M, Bahl L R, Nahamoo D, et al. Speaker clustering and transformation for speaker adaptation in speech recognition systems [J]. IEEE Trans on Speech and Audio Processing, 1998, 6(1): 71-77.
5. WU Jian, CHANG Eric. Cohorts based custom models for rapid speaker and dialect adaptation [A]. Proc Eurospeech [C]. Aalborg, Denmark: ISCA Press, 2001, 2: 1261-1264.
6. HUANG Chao, CHEN Tao, CHANG Eric. Speaker selection training for large vocabulary continuous speech recognition [A]. Proceedings of IEEE International Conference on Acoustics Speech and Signal Processing [C]. Orlando, Florida: IEEE Press, 2002, 1: 609-612.
