
Word-Fragment-Based Language Model and Its Application to Chinese Speech Retrieval (cited by: 5)

Study on performance optimization for Chinese speech retrieval
Abstract: To take full advantage of the linguistic knowledge carried by syllable co-occurrence in Chinese, a new unit for building Chinese language models, the "word fragment", is proposed for Chinese speech retrieval, and an algorithm for selecting the best word fragments is studied. Chinese speech recognition and speech retrieval experiments show that, with the word-fragment-based language model, the recognizer's syllable accuracy improves and the speech retrieval system performs better than one that uses only a syllable-based model.
Source: Journal on Communications (《通信学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2009, No. 3, pp. 84-88 (5 pages)
Funding: National Basic Research Program of China ("973" Program) (2007CB311100); National Natural Science Foundation of China (60575030)
Keywords: Chinese speech retrieval; language model; word fragment; mutual information; lattice
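The abstract and keywords indicate that word fragments are chosen from adjacent syllables, with mutual information as one ingredient, but this record does not give the selection algorithm itself. The following is only a minimal sketch of one plausible approach, assuming fragments are adjacent syllable pairs ranked by pointwise mutual information (PMI). The function name `select_word_fragments`, the parameters `min_count` and `pmi_threshold`, and the toy pinyin corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def select_word_fragments(sentences, min_count=2, pmi_threshold=1.0):
    """Return adjacent syllable pairs whose PMI reaches a threshold.

    sentences: list of syllable lists (hypothetical toy input format).
    min_count and pmi_threshold are illustrative values, not from the paper.
    """
    unigrams = Counter()
    bigrams = Counter()
    for syllables in sentences:
        unigrams.update(syllables)
        # Count each pair of adjacent syllables within a sentence.
        bigrams.update(zip(syllables, syllables[1:]))

    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())

    fragments = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI estimates are unreliable
        p_ab = count / total_bi
        p_a = unigrams[a] / total_uni
        p_b = unigrams[b] / total_uni
        pmi = math.log2(p_ab / (p_a * p_b))
        if pmi >= pmi_threshold:
            fragments[(a, b)] = pmi
    return fragments


# Toy corpus of tone-marked pinyin syllables (made up for illustration).
corpus = [
    ["yu3", "yin1", "jian3", "suo3"],
    ["yu3", "yin1", "shi2", "bie2"],
    ["han4", "yu3", "yu3", "yin1"],
]
fragments = select_word_fragments(corpus)
```

In this toy corpus only the pair ("yu3", "yin1") occurs often enough to pass the count filter, so it is the single candidate fragment. A full system would presumably add the selected fragments to the language model vocabulary alongside single syllables, a step this sketch leaves out.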


