
Articulatory-feature-based tone modeling for Mandarin and its application to Mandarin speech recognition (cited by: 2)

Improved tone modeling by exploiting articulatory features for Mandarin speech recognition
Abstract: Articulatory features, which capture how speech sounds are produced, can complement conventional prosodic features and improve the accuracy of tone modeling. Based on an analysis of the pronunciation characteristics of Mandarin initials and finals, the manner of articulation is divided into 19 categories. Hierarchical multilayer perceptron classifiers then compute the posterior probability that the speech signal belongs to each category, and these 19 posteriors serve as articulatory tandem features. The articulatory features are used together with the prosodic features for tone modeling. Tone recognition experiments with three kinds of tone models show an absolute accuracy gain of about 5% when both feature types are used. When the proposed tone model is integrated into a Large Vocabulary Continuous Speech Recognition (LVCSR) system, the character error rate is reduced significantly.
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journals, 2013, No. 10, pp. 2939-2944 (6 pages)
Funding: National Natural Science Foundation of China (91120303, 90820303, 90820011)
Keywords: speech recognition; tone modeling; articulatory feature (AF); hierarchical multilayer perceptron classifier
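The feature pipeline described in the abstract (posteriors over 19 pronunciation categories, then concatenation with prosodic features) can be illustrated with a minimal sketch. The abstract does not specify the network layout, so everything beyond the number 19 is an assumption made for illustration: the two-stage arrangement in which a second perceptron refines a context window of first-stage posteriors, the hidden-layer size, the 39-dimensional acoustic frames, the 6-dimensional prosodic vector, the per-syllable averaging, and the random untrained weights. All function names are hypothetical.

```python
import math
import random

N_CLASSES = 19  # number of pronunciation categories stated in the abstract


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_mlp(rng, n_in, n_hidden, n_out):
    """Random, untrained weights: placeholders for a trained classifier."""
    w1 = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2


def mlp_posteriors(x, mlp):
    """One hidden tanh layer followed by a softmax output layer."""
    w1, w2 = mlp
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
    return softmax([sum(w * h for w, h in zip(row, hidden)) for row in w2])


def stack_context(frames, radius):
    """Concatenate each frame with its neighbours; edge frames are repeated."""
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        window = []
        for k in range(t - radius, t + radius + 1):
            window.extend(frames[min(max(k, 0), last)])
        stacked.append(window)
    return stacked


def articulatory_features(acoustic, stage1, stage2, radius):
    """Assumed hierarchy: stage 2 refines a window of stage-1 posteriors."""
    first = [mlp_posteriors(frame, stage1) for frame in acoustic]
    return [mlp_posteriors(w, stage2) for w in stack_context(first, radius)]


def tone_feature_vector(acoustic, prosodic, stage1, stage2, radius):
    """Average the frame posteriors (an assumed pooling) and append them
    to the prosodic features of the same syllable."""
    post = articulatory_features(acoustic, stage1, stage2, radius)
    mean_post = [sum(col) / len(post) for col in zip(*post)]
    return list(prosodic) + mean_post


rng = random.Random(0)
ACOUSTIC_DIM, PROSODIC_DIM, RADIUS = 39, 6, 2  # illustrative sizes only
stage1 = init_mlp(rng, ACOUSTIC_DIM, 32, N_CLASSES)
stage2 = init_mlp(rng, N_CLASSES * (2 * RADIUS + 1), 32, N_CLASSES)
acoustic = [[rng.gauss(0, 1) for _ in range(ACOUSTIC_DIM)] for _ in range(40)]
prosodic = [rng.gauss(0, 1) for _ in range(PROSODIC_DIM)]
vec = tone_feature_vector(acoustic, prosodic, stage1, stage2, RADIUS)
print(len(vec))  # 25 = 6 prosodic values + 19 posteriors
```

In a real system the two perceptrons would be trained on frames labeled with articulatory categories, and the combined vector would feed whichever tone classifier is being evaluated; the sketch only shows how the two feature streams are joined.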


