
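Investigation of ASR Systems for Resource-deficient Languages

Original title: 语料资源缺乏的连续语音识别方法的研究 (A Study of Continuous Speech Recognition Methods for Languages Lacking Corpus Resources). Cited by: 9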
Abstract: Minority languages in China have characteristics of their own, so the automatic speech recognition (ASR) methods developed for major languages such as Chinese, English, and Japanese cannot simply be applied to them unchanged. Taking Mongolian, a resource-deficient language, as an example, this paper discusses how to build acoustic and language models and implements a Mongolian speech recognition system on ATRASR, the continuous speech recognizer of the Advanced Telecommunications Research Institute International (ATR), Japan. The focus is on language modeling: exploiting the agglutinative nature of Mongolian, the authors propose a multi-class N-gram model built by clustering similar words. In experiments, the proposed language model improves recognition accuracy by 5.5% over a conventional word N-gram model.
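Source: Acta Automatica Sinica (《自动化学报》), 2010, No. 4, pp. 550-557 (8 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: Supported by the multilingual advanced speech-text processing research project of the National Institute of Information and Communications Technology (NICT), Japan.
Keywords: Mongolian language, agglutinative language, similar word clustering, continuous speech recognition, multi-class N-gram model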
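
The abstract names a multi-class N-gram model built on similar-word clustering but does not give its equations. As a rough illustration of why class-based N-grams help when training text is scarce, the sketch below implements the textbook class bigram, P(w | w_prev) ≈ P(c | c_prev) · P(w | c), which may differ from the paper's exact formulation. The word-to-class map and the toy corpus are hypothetical stand-ins; in the paper the classes would come from clustering similar words, not from a hand-written table.

```python
from collections import Counter

# Hypothetical word-to-class map (a stand-in for the output of word clustering).
WORD2CLASS = {"red": "ADJ", "blue": "ADJ", "car": "NOUN", "book": "NOUN"}

# Hypothetical toy corpus; note that the pair ("blue", "book") never occurs.
SENTENCES = [["red", "car"], ["blue", "car"], ["red", "book"]]


def train_class_bigram(sentences, word2class):
    """Collect the counts needed for maximum-likelihood class-bigram estimates."""
    class_bigrams = Counter()  # count(c_prev, c)
    class_context = Counter()  # count(c_prev) when used as a bigram context
    class_totals = Counter()   # count(c) over all tokens
    word_counts = Counter()    # count(w) over all tokens
    for sent in sentences:
        classes = [word2class[w] for w in sent]
        for w, c in zip(sent, classes):
            word_counts[w] += 1
            class_totals[c] += 1
        for c_prev, c in zip(classes, classes[1:]):
            class_bigrams[(c_prev, c)] += 1
            class_context[c_prev] += 1
    return class_bigrams, class_context, class_totals, word_counts


def bigram_prob(w_prev, w, model, word2class):
    """Return P(c | c_prev) * P(w | c) with unsmoothed relative frequencies."""
    class_bigrams, class_context, class_totals, word_counts = model
    c_prev, c = word2class[w_prev], word2class[w]
    if class_context[c_prev] == 0 or class_totals[c] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c)] / class_context[c_prev]
    p_word = word_counts[w] / class_totals[c]
    return p_class * p_word


MODEL = train_class_bigram(SENTENCES, WORD2CLASS)
# The unseen word pair still receives probability mass through its classes.
P_UNSEEN = bigram_prob("blue", "book", MODEL, WORD2CLASS)
```

A plain word bigram trained on the same three sentences would assign zero probability to "blue book" because that pair was never observed. Sharing statistics across a class is the property that makes this family of models attractive for languages with little training text.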

