
Improved syllable-based acoustic modeling for continuous Chinese speech recognition (汉语语音识别中基于音节的声学模型改进算法)

Cited by: 1
Abstract: To address the variability of the speech signal caused by co-articulation in Chinese speech recognition, a syllable-based acoustic modeling method is proposed. First, context-independent syllable-based acoustic models are built to capture the sound changes between the initial and the final within a syllable, and their parameters are initialized from intra-syllable diphone models built on initials and finals (IFs) to alleviate training-data sparsity. Second, inter-syllable transition models are introduced to handle co-articulation between syllables. Continuous Chinese speech recognition experiments on the "863-test" test set show a 12.13% relative reduction in character error rate, indicating that combining syllable-based acoustic models with inter-syllable transition models is effective against co-articulation in Chinese.
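The initialization step mentioned in the abstract can be pictured with a small sketch. The abstract does not give the actual procedure, so everything below is an assumption made for illustration: the diphone naming scheme ("zh+ang", "zh-ang"), three emitting states per diphone, single-Gaussian states, and the helper names `init_syllable_model` and `toy_hmm` are all hypothetical. The sketch only shows the general idea of seeding a syllable model by concatenating copies of the states of its two intra-syllable diphone models, so that syllables with few training samples start from better-trained parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import copy


@dataclass
class HmmState:
    """One emitting state, reduced here to a single Gaussian mean/variance."""
    mean: List[float]
    var: List[float]


@dataclass
class Hmm:
    """A left-to-right HMM represented only by its ordered emitting states."""
    name: str
    states: List[HmmState] = field(default_factory=list)


def init_syllable_model(initial: str, final: str, diphones: Dict[str, Hmm]) -> Hmm:
    """Seed a syllable HMM from its two intra-syllable diphone models.

    Assumed naming (hypothetical): "zh+ang" is the initial "zh" followed by
    the final "ang"; "zh-ang" is the final "ang" preceded by the initial "zh".
    """
    left = diphones[f"{initial}+{final}"]   # initial, right context = final
    right = diphones[f"{initial}-{final}"]  # final, left context = initial
    # Deep-copy so that later re-estimation of the syllable model does not
    # modify the diphone models it was seeded from.
    states = copy.deepcopy(left.states) + copy.deepcopy(right.states)
    return Hmm(name=initial + final, states=states)


def toy_hmm(name: str, n_states: int, value: float) -> Hmm:
    """Build a placeholder HMM whose states all share one made-up mean."""
    return Hmm(name, [HmmState([value], [1.0]) for _ in range(n_states)])


# Made-up parameters, used only to show the data flow.
diphones = {
    "zh+ang": toy_hmm("zh+ang", 3, 0.0),
    "zh-ang": toy_hmm("zh-ang", 3, 1.0),
}
syllable = init_syllable_model("zh", "ang", diphones)
```

In a real system the seeded syllable model would then be re-estimated on syllable-level training data; that step, and the inter-syllable transition models, are outside this sketch.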
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 6, pp. 1742-1745 (4 pages)
Funding: National Natural Science Foundation of China (91120303, 90820303, 90820011); National 973 Program (2004CB318105); National 863 Program (20060101Z4073, 2006AA01Z194)
Keywords: speech recognition; co-articulation; acoustic variability; acoustic modeling; syllable model

