Phoneme modeling units design for Mandarin LVCSR systems

Cited by: 2
Abstract: In a speech recognition system, the modeling units capture the acoustic and phonetic characteristics of a language and therefore play a critical role in system performance. Drawing on several modeling unit sets that have performed well in large vocabulary continuous speech recognition (LVCSR) systems, this paper constructs a new phoneme modeling unit set (NewPS). In addition, based on how the vowels and their variants in NewPS affect the coarticulation of the preceding and following phonemes, it proposes a method for designing a question set (NewQS) from an extended vowel triangle. Experiments show that the combination of NewPS and NewQS outperforms the traditional initial/final modeling unit set, and that the large reduction in the number of modeling units simplifies processing in the system's subsequent modules.
Source: Journal of Tsinghua University (Science and Technology), 2011, No. 9, pp. 1288-1292, 1297 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: large vocabulary continuous speech recognition (LVCSR); modeling units; vowel triangle; question set; main vowel principle

References (10)

  • 1. HUANG Chao, SHI Yu, ZHOU Jianlai, et al. Segmental tonal modeling for phone set design in Mandarin LVCSR [C]// Proc ICASSP'04. Montreal, Canada: IEEE Signal Processing Society, 2004, 1: 901-904.
  • 2. CHANG Eric, ZHOU Jianlai, DI Shuo, et al. Large vocabulary Mandarin speech recognition with different approaches in modeling tones [C]// Proc ICSLP. Beijing, China: China Military Friendship Press, 2000, 2: 983-986.
  • 3. LI Jing, XU Mingxing. Comparison of acoustic modeling units for Mandarin continuous speech recognition: syllables, phonemes, and initials/finals [C]// 6th National Conference on Man-Machine Speech Communication, 20014:267-280. (in Chinese)
  • 4. MA Bin, HUO Qiang. Benchmark results of triphone-based acoustic modeling on HKU96 and HKU99 Putonghua corpora [C]// Proc ISCSLP. Beijing, China, 2000: 359-362.
  • 5. XIANG Bing, Long Nguyen, GUO Xuefeng, et al. The BBN Mandarin broadcast news transcription system [C]// Proc Interspeech'05. Lisbon, Portugal, 2005: 1649-1652.
  • 6. Hwang M Y, PENG Gang, WANG Wen, et al. Building a highly accurate Mandarin speech recognizer [C]// Proc ASRU'07. Kyoto, Japan: IEEE Signal Processing Society, 2007: 490-495.
  • 7. Hwang M Y, PENG Gang, Ostendorf M, et al. Building a highly accurate Mandarin speech recognizer with language-independent technologies and language-dependent modules [J]. IEEE Trans on Audio, Speech and Language Processing, 2009, 17(7): 1253-1262.
  • 8. Plahl C, Hoffmeister B, Hwang M Y, et al. Recent improvements of the RWTH GALE Mandarin LVCSR system [C]// Proc Interspeech'08. Brisbane, Australia, 2008: 2426-2429.
  • 9. CHEN C J, LI Haiping, SHEN Liqin, et al. Recognize tone languages using pitch information on the main vowel of each syllable [C]// Proc ICASSP'01. Salt Lake City, UT, USA: IEEE Press, 2001: 61-64.
  • 10. Young S, Evermann G, Gales M, et al. The HTK Book (revised for HTK version 3.4) [M]. Cambridge: Cambridge University, 2006.

