Multilingual Acoustic Modeling Method Based on Phoneme Clustering (cited by: 1)
Abstract An acoustic modeling method is proposed that generates a multilingual universal phoneme set by clustering, using the decrease in the model's self-likelihood after two phonemes are merged as the distance measure. On this basis, the effect of two constraints added during clustering is compared: phonemes within the same language are not merged, and phonemes belonging to different International Phonetic Alphabet (IPA) classes are not merged. The influence of the size of the universal phoneme set on recognition performance is also explored. Finally, experiments on a telephone-speech keyword spotting system report results for Chinese-English bilingual models and compare four clustering methods at different universal phoneme set sizes. The results show that merging phonemes to an appropriate degree with the proposed method performs clearly better than direct mixed modeling without clustering, and that adding suitable constraints to the clustering procedure improves performance further.
Source Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list, 2009, No. 1, pp. 86-90 (5 pages)
Funding Supported by the National High-Tech R&D Program of China (863 Program) (No. 2006AA010103)
Keywords Multilingual Acoustic Modeling, Phoneme Clustering, Keyword Spotting
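
The clustering criterion described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's model structure or exact procedure, so the following is an assumption-laden illustration, not the authors' implementation: each phoneme is summarized by a single diagonal-covariance Gaussian with a frame count, the distance between two units is the drop in self-likelihood when they are replaced by one merged Gaussian, and merging is greedy and bottom-up. The names `Phone`, `merge`, `likelihood_loss`, `allowed`, and `cluster` are hypothetical, as is the reading of the same-language constraint as "two units that share any language are never merged".

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Phone:
    """Hypothetical summary of one phoneme (or merged cluster)."""
    name: str
    langs: frozenset      # languages contributing data to this unit
    ipa_class: str        # coarse IPA class label, e.g. "vowel"
    n: float              # number of training frames
    mean: tuple           # per-dimension mean
    var: tuple            # per-dimension variance (diagonal covariance)


def merge(a, b):
    """Pool the sufficient statistics of two units into one Gaussian."""
    n = a.n + b.n
    mean = tuple((a.n * ma + b.n * mb) / n for ma, mb in zip(a.mean, b.mean))
    var = tuple(
        (a.n * (va + ma * ma) + b.n * (vb + mb * mb)) / n - m * m
        for va, ma, vb, mb, m in zip(a.var, a.mean, b.var, b.mean, mean)
    )
    # When the IPA constraint is off, the merged label is simply a's label.
    return Phone(a.name + "+" + b.name, a.langs | b.langs, a.ipa_class, n, mean, var)


def self_loglik(p):
    """Log-likelihood of a unit's own data under its ML-fitted Gaussian."""
    d = len(p.mean)
    return -0.5 * p.n * (d * math.log(2 * math.pi) + sum(math.log(v) for v in p.var) + d)


def likelihood_loss(a, b):
    """Decrease in self-likelihood caused by merging a and b (non-negative)."""
    return self_loglik(a) + self_loglik(b) - self_loglik(merge(a, b))


def allowed(a, b, same_lang_block, ipa_block):
    """Apply the two optional constraints mentioned in the abstract."""
    if same_lang_block and (a.langs & b.langs):
        return False  # assumed reading: units sharing a language stay apart
    if ipa_block and a.ipa_class != b.ipa_class:
        return False  # units from different IPA classes stay apart
    return True


def cluster(phones, target_size, same_lang_block=True, ipa_block=True):
    """Greedily merge the cheapest allowed pair until target_size is reached."""
    units = list(phones)
    while len(units) > target_size:
        candidates = [
            (likelihood_loss(a, b), i, j)
            for (i, a), (j, b) in combinations(enumerate(units), 2)
            if allowed(a, b, same_lang_block, ipa_block)
        ]
        if not candidates:
            break  # the constraints forbid every remaining merge
        _, i, j = min(candidates)
        merged = merge(units[i], units[j])
        units = [u for k, u in enumerate(units) if k not in (i, j)] + [merged]
    return units
```

Toggling `same_lang_block` and `ipa_block` gives four clustering variants, and `target_size` plays the role of the universal phoneme set size. How these map onto the four methods compared in the paper is not stated in this record.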

