
决策树结构对说话人自适应影响的研究 (cited by: 3)

Research on the influence of decision tree structure on speaker adaptation
Abstract: A state-restructuring algorithm is proposed that uses both adaptation data and training data to adjust the state structure of the acoustic model. The algorithm shares parameters between easily confused states, which raises the posterior probability of the model given the samples, makes better use of the adaptation data, and indirectly restructures the decision tree of the system. Recognition experiments show that, across different numbers of adaptation sentences per speaker, the restructured system consistently achieves a higher recognition rate than the baseline system. Combined with MLLR speaker adaptation, the restructured system improves the recognition rate by 15.60% on average compared with MLLR alone. These results indicate that the method effectively reduces the loss in recognition rate caused by the mismatch between the decision tree structures of the training data and the test data.
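Source: Acta Acustica (《声学学报》), 2006, No. 1, pp. 42-47 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: Basic Research Project of the Science and Technology Commission of Shanghai Municipality (01JC14033)
Keywords: speaker adaptation; tree structure; decision tree; baseline system; restructuring; recognition rate; posterior probability; MLLR; corpus; utilization rate; Adaptive systems; Algorithms; Data processing; Decision making; Probability; Trees (mathematics)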
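The abstract describes sharing parameters between easily confused states but does not give the procedure. As a rough illustration only, the sketch below shows one common way two HMM states can be tied: pooling their diagonal-Gaussian statistics, weighted by occupancy counts. The function name `tie_states`, the single-Gaussian diagonal-covariance assumption, and the use of occupancy counts are all hypothetical choices made for this example, not details taken from the paper, and the paper's criterion for deciding which states are confusable is not reproduced here.

```python
def tie_states(count_a, mean_a, var_a, count_b, mean_b, var_b):
    """Pool two diagonal-Gaussian states into one shared state.

    Assumption (not from the paper): each state is a single Gaussian with
    diagonal covariance, summarised by an occupancy count, a mean vector,
    and a variance vector.
    """
    total = count_a + count_b
    # Occupancy-weighted mean of the two states, per dimension.
    mean = [
        (count_a * ma + count_b * mb) / total
        for ma, mb in zip(mean_a, mean_b)
    ]
    # Pooled variance: weighted second moment minus the squared pooled mean.
    var = [
        (count_a * (va + ma * ma) + count_b * (vb + mb * mb)) / total - m * m
        for ma, va, mb, vb, m in zip(mean_a, var_a, mean_b, var_b, mean)
    ]
    return total, mean, var


if __name__ == "__main__":
    # Two one-dimensional states with equal occupancy.
    print(tie_states(1, [0.0], [1.0], 1, [2.0], [1.0]))
```

After tying, both original states point at the same parameters, so any adaptation data aligned to either state updates the shared model. This is the general sense in which parameter sharing can raise the use made of a small adaptation set.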