
基于音素混淆网络的蒙古语语音关键词检测方法的研究

Research on Mongolian Spoken Term Detection Based on Phoneme Confusion Network
Abstract: The vocabulary of a Mongolian speech recognition system can hardly cover all Mongolian words, and new words and loanwords keep entering the language as society develops. To address out-of-vocabulary (OOV) word detection in Mongolian spoken term detection systems, this paper proposes a Mongolian spoken term detection method based on a phoneme confusion network and improves the keyword confidence measure by incorporating a phoneme confusion matrix. Experimental results show that the method handles OOV word detection well. With the improved confidence measure, the precision of the Mongolian spoken term detection system rises by 6% and the recall by 2.69%, a clear improvement in performance.
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 1, pp. 178-182 (5 pages)
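Funding: National Natural Science Foundation of China (61263037, 71163029); Natural Science Foundation of Inner Mongolia (2014BS0604); Inner Mongolia University Research Project for the Introduction of High-Level Talent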
Keywords: Mongolian; spoken term detection; out-of-vocabulary word; confusion network; phoneme confusion matrix
