Continuous Speech Recognition for Natural Resource-deficient Minority Languages (original title: 自然语料缺乏的民族语言连续语音识别) Cited by: 2
Abstract Taking Uyghur as an example, this paper studies continuous speech recognition for minority languages that lack natural corpora. A seed acoustic model is trained with HTK on a small amount of manually labeled speech and is then used to guide the construction of the final acoustic model from a larger speech corpus and its transcriptions. The statistical language model is generated with the palmkit tool, and continuous speech recognition is performed with the open-source recognizer Julius. In the experiments, 6,400 short utterances spoken freely by 64 native Uyghur speakers are used to train a monophone acoustic model, and 100 MB of text together with a 60,000-word dictionary are used to generate a class-based 3-gram language model. The test results show a recognition rate of 72.5% for real-time speech recognition, 4.2 percentage points higher than the 68.3% obtained with HTK alone.
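The abstract mentions a class-based 3-gram language model but does not give its formulation. As a minimal sketch, assuming the common factorization P(w | h) ≈ P(w | c(w)) · P(c(w) | c(h1), c(h2)) with unsmoothed maximum-likelihood counts, the idea can be illustrated as follows. The function names, the toy corpus, and the word-to-class map are all hypothetical; this is not palmkit's implementation or the authors' exact model.

```python
from collections import Counter

def train_class_trigram(sentences, word2class):
    """Collect the counts needed for a class-based trigram model."""
    class_tri = Counter()      # counts of class trigrams (c1, c2, c3)
    class_hist = Counter()     # counts of class histories (c1, c2)
    word_in_class = Counter()  # counts of (class, word) pairs
    class_total = Counter()    # counts of each class
    for words in sentences:
        # Two start symbols pad the history of the first word.
        classes = ["<s>", "<s>"] + [word2class[w] for w in words]
        for i, w in enumerate(words):
            c1, c2, c3 = classes[i], classes[i + 1], classes[i + 2]
            class_tri[(c1, c2, c3)] += 1
            class_hist[(c1, c2)] += 1
            word_in_class[(c3, w)] += 1
            class_total[c3] += 1
    return class_tri, class_hist, word_in_class, class_total

def class_trigram_prob(word, history, word2class, model):
    """Unsmoothed estimate of P(word | history) via word classes."""
    class_tri, class_hist, word_in_class, class_total = model
    # Map history words to classes; "<s>" has no class and passes through.
    c1, c2 = (word2class.get(h, h) for h in history)
    c3 = word2class[word]
    if class_hist[(c1, c2)] == 0 or class_total[c3] == 0:
        return 0.0
    p_class = class_tri[(c1, c2, c3)] / class_hist[(c1, c2)]
    p_word = word_in_class[(c3, word)] / class_total[c3]
    return p_class * p_word

# Toy data (hypothetical): abstract tokens grouped into three classes.
WORD2CLASS = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}
SENTENCES = [["a1", "b1", "c1"], ["a2", "b2", "c1"], ["a1", "b1", "c2"]]

model = train_class_trigram(SENTENCES, WORD2CLASS)
# The word trigram "a2 b2 c2" never occurs in the toy corpus, yet it
# receives non-zero probability because its class trigram A B C does.
p_unseen = class_trigram_prob("c2", ("a2", "b2"), WORD2CLASS, model)
print(p_unseen)
```

Sharing statistics across words of the same class is what makes this kind of model attractive when text is scarce: an unseen word sequence still gets probability mass as long as its class sequence was observed. A practical system would also add smoothing, which this sketch omits.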
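Source Computer Engineering (《计算机工程》), CAS, CSCD, 2012, No. 12, pp. 129-131, 135 (4 pages)
Funding National Natural Science Foundation of China, General Program (2011211A012, 60863008); Xinjiang Uyghur Autonomous Region Science and Technology Support-Xinjiang Fund (201091106); Doctoral Start-up Fund (BS090144)
Keywords continuous speech recognition; seed model; acoustic model; language model; Uyghur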