
A Combined Clustering Algorithm of Acoustic Modelling for Continuous Speech Recognition

Cited by: 5
Abstract: A crucial issue in triphone-based continuous speech recognition is the robust estimation of a large number of acoustic model parameters from limited training data. Two major context-dependent clustering methods have been proposed to address this problem: agglomerative clustering (AGG) and decision tree-based clustering (TB). This paper analyzes the advantages and disadvantages of both algorithms, improves each of them, and then proposes a combined clustering algorithm within the maximum likelihood framework. Experiments on large vocabulary continuous speech recognition (LVCSR) show that the proposed combined algorithm significantly improves recognition accuracy compared with the tree-based clustering algorithm alone.
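The abstract does not give the details of either clustering method, so the following is only a minimal sketch of generic likelihood-based agglomerative state clustering, not the authors' algorithm. It assumes each state is modeled by a single diagonal-covariance Gaussian and is summarized by its sufficient statistics. The names `Stats`, `from_frames`, `merge_loss`, `agglomerate`, `max_loss`, and `var_floor` are hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Stats:
    """Sufficient statistics of one state (hypothetical container)."""
    n: float          # number of frames assigned to the state
    s1: List[float]   # per-dimension sum of feature values
    s2: List[float]   # per-dimension sum of squared feature values


def from_frames(frames: Sequence[Sequence[float]]) -> Stats:
    """Accumulate statistics from a list of feature vectors."""
    cols = list(zip(*frames))
    return Stats(float(len(frames)),
                 [sum(c) for c in cols],
                 [sum(x * x for x in c) for c in cols])


def merge(a: Stats, b: Stats) -> Stats:
    """Pool the statistics of two clusters."""
    return Stats(a.n + b.n,
                 [x + y for x, y in zip(a.s1, b.s1)],
                 [x + y for x, y in zip(a.s2, b.s2)])


def log_likelihood(c: Stats, var_floor: float = 1e-4) -> float:
    """Log-likelihood of a cluster's frames under its own ML diagonal Gaussian.

    Equals -n/2 * (d * (1 + log(2*pi)) + sum(log var)); it is only
    approximate for dimensions where the variance floor is active.
    """
    d = len(c.s1)
    log_det = 0.0
    for x1, x2 in zip(c.s1, c.s2):
        mean = x1 / c.n
        log_det += math.log(max(x2 / c.n - mean * mean, var_floor))
    return -0.5 * c.n * (d * (1.0 + math.log(2.0 * math.pi)) + log_det)


def merge_loss(a: Stats, b: Stats) -> float:
    """Drop in total log-likelihood caused by tying clusters a and b."""
    return log_likelihood(a) + log_likelihood(b) - log_likelihood(merge(a, b))


def agglomerate(clusters: Sequence[Stats], max_loss: float) -> List[Stats]:
    """Greedily merge the cheapest pair until every merge costs more than max_loss."""
    clusters = list(clusters)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                loss = merge_loss(clusters[i], clusters[j])
                if best is None or loss < best[0]:
                    best = (loss, i, j)
        loss, i, j = best
        if loss > max_loss:
            break
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

A tree-based method would typically use the same likelihood criterion in the opposite direction, choosing at each node the phonetic question whose split yields the largest likelihood gain. How the paper combines the two directions is not stated in the abstract.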
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2003, No. 4, pp. 33-38 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (69835003) and the 973 Program (G1998030504)
Keywords: continuous speech recognition; acoustic modeling; combined clustering algorithm; agglomerative clustering; decision tree-based clustering; robust estimation; computer application; Chinese information processing; speech recognition
