
Acoustic Modeling Research of Chinese-English Bilingual Isolated Word Recognition System (中英文混合孤立词识别系统声学建模方法研究)
Cited by: 1
Abstract: This paper studies acoustic modeling methods for a Chinese-English bilingual recognition system. To improve recognition performance and reduce the number of model parameters in the bilingual system, it proposes a Chinese-English phone model clustering algorithm that combines a model distance measure based on state-time alignment (STA) with acoustic knowledge, and compares it with other methods. Experiments show that, at the same model parameter size, the proposed clustering algorithm clearly outperforms the direct combination of the language-dependent phone inventories, and also improves to varying degrees on methods based on Bhattacharyya distance and acoustic likelihood distance.
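Authors: Wu Pengfei (吴鹏飞), Liu Jia (刘加)
Source: Audio Engineering (《电声技术》), 2009, No. 11, pp. 68-71 (4 pages)
Funding: Joint project of the National Natural Science Foundation of China and Microsoft Research Asia (60776800); National High Technology Research and Development Program of China (863 Program) (2006AA010101, 2007AA04Z223, 2008AA02Z414)
Keywords: Chinese-English bilingual; state-time alignment (STA); phone model clustering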
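
The abstract names Bhattacharyya distance as one of the baseline measures for deciding which Chinese and English phone models are close enough to cluster. The sketch below illustrates only that baseline; this record does not describe the STA measure, so it is not attempted here. It assumes each phone is summarized by a single diagonal-covariance Gaussian, a simplification of the HMM state mixtures a real system would use. The function names and the toy phone labels and values are hypothetical.

```python
import math


def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    dist = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)                              # averaged variance
        dist += 0.125 * (m1 - m2) ** 2 / v               # mean-separation term
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))   # variance-mismatch term
    return dist


def closest_cross_language_pair(zh_phones, en_phones):
    """Return (zh_name, en_name, distance) for the closest cross-language pair.

    Each argument maps a phone name to a (mean, variance) pair of lists.
    """
    best = None
    for zh_name, (zh_mean, zh_var) in zh_phones.items():
        for en_name, (en_mean, en_var) in en_phones.items():
            d = bhattacharyya_diag(zh_mean, zh_var, en_mean, en_var)
            if best is None or d < best[2]:
                best = (zh_name, en_name, d)
    return best


# Toy one-dimensional models (hypothetical values, for illustration only).
zh = {"a_zh": ([0.0], [1.0]), "i_zh": ([5.0], [1.0])}
en = {"aa_en": ([0.5], [1.0]), "iy_en": ([4.0], [1.0])}
pair = closest_cross_language_pair(zh, en)
```

In a clustering loop, the closest pair would be the first candidate for merging into a shared phone model. Repeating the search after each merge gives a simple greedy agglomerative procedure, which is one common way to use such a distance and not necessarily the procedure used in the paper.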

References (9)

  • [1] KOHLER J. Multilingual phone models for vocabulary-independent speech recognition tasks [J]. Speech Communication, 2001, 35(1-2): 21-30.
  • [2] YU S M, HU S, ZHANG S W, et al. Chinese-English bilingual speech recognition [C]//Proceedings of NLP-KE'03. Beijing: IEEE Press, 2003: 603-609.
  • [3] YU S M, ZHANG S W, XU B. Chinese-English bilingual phone modeling for cross-language speech recognition [C]//Proceedings of ICASSP'04. Montreal: IEEE Press, 2004: 917-920.
  • [4] SEIDE F, WANG J C. Phonetic modeling in the Philips Chinese continuous speech recognition system [C]//Proceedings of ISCSLP'98. Singapore: [s.n.], 1998: 54-59.
  • [5] CHEN Y J, WU C H, CHIU Y H, et al. Generation of robust phonetic set and decision tree for Mandarin using chi-square testing [J]. Speech Communication, 2002, 38(3-4): 349-364.
  • [6] IPA. The international phonetic association (revised to 1993) IPA chart [G]. [S.l.]: Journal of the International Phonetic Association, 1993.
  • [7] ZHANG Q Q, PAN J L, YAN Y H. Mandarin-English bilingual speech recognition for real world music retrieval [C]//Proceedings of ICASSP'08. Las Vegas: IEEE Press, 2008: 4253-4256.
  • [8] HUANG C L, WU C H. Generation of phonetic units for mixed-language speech recognition based on acoustic and contextual analysis [J]. IEEE Trans. on Computers, 2007, 56(9): 1245-1254.
  • [9] UEBLER U. Multilingual speech recognition in seven languages [J]. Speech Communication, 2001, 35(1-2): 53-69.

