Improved syllable-based acoustic modeling for continuous Chinese speech recognition

Original title: 汉语语音识别中基于音节的声学模型改进算法 (cited by: 1)


Abstract: To address the variability of the speech signal caused by co-articulation in Chinese speech recognition, a syllable-based acoustic modeling method is proposed. First, context-independent syllable-based acoustic models are built to capture the acoustic variation between the initial and the final within a syllable, and their parameters are initialized from intra-syllable diphone models based on initials and finals (IFs) to alleviate the sparsity of training data. Second, inter-syllable transition models are introduced into the recognition system to handle co-articulation between syllables. Continuous Chinese speech recognition experiments on the "863-test" set show a relative reduction of 12.13% in character error rate, which indicates that combining syllable-based acoustic models with inter-syllable transition models is effective in handling co-articulation in Chinese.
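The abstract does not spell out how the diphone models seed the syllable models, so the following is only a minimal sketch of one plausible reading: the state sequence of a syllable model is initialized by concatenating the states of the initial (in the right context of the final) and of the final (in the left context of the initial). The `GaussianState` layout, the `SYLLABLE_TO_IF` lexicon, the HTK-style names `b+a` / `b-a`, and all numeric values are assumptions made for illustration, not details taken from the paper. Zero-initial syllables are not handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GaussianState:
    """One emitting HMM state with a diagonal Gaussian (hypothetical layout)."""
    mean: Tuple[float, ...]
    var: Tuple[float, ...]


# Hypothetical lexicon mapping a toneless syllable to (initial, final).
SYLLABLE_TO_IF: Dict[str, Tuple[str, str]] = {
    "ba": ("b", "a"),
    "ma": ("m", "a"),
}


def init_syllable_model(
    syllable: str,
    diphone_models: Dict[str, List[GaussianState]],
    lexicon: Dict[str, Tuple[str, str]] = SYLLABLE_TO_IF,
) -> List[GaussianState]:
    """Seed a syllable model from its two intra-syllable diphone models.

    Assumed naming (HTK style): "b+a" is the initial b followed by a,
    "b-a" is the final a preceded by b.
    """
    initial, final = lexicon[syllable]
    initial_states = diphone_models[f"{initial}+{final}"]
    final_states = diphone_models[f"{initial}-{final}"]
    # States are immutable, so sharing them here is safe; re-estimation on
    # syllable-level data would replace them with new state objects.
    return list(initial_states) + list(final_states)


# Toy diphone models with made-up one-dimensional parameters.
toy_diphones = {
    "b+a": [GaussianState((0.0,), (1.0,)), GaussianState((0.5,), (1.0,))],
    "b-a": [
        GaussianState((1.0,), (2.0,)),
        GaussianState((1.5,), (2.0,)),
        GaussianState((2.0,), (2.0,)),
    ],
}

ba_states = init_syllable_model("ba", toy_diphones)
print(len(ba_states))  # 5 states: 2 from the initial, 3 from the final
```

Under this reading, a syllable with few training examples still starts from parameters estimated on the more frequent diphones, which is one way the data-sparsity problem named in the abstract could be eased before syllable-level re-estimation.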

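Authors: Chao Hao (晁浩), Yang Zhanlei (杨占磊), Liu Wenju (刘文举)

Affiliations: School of Computer Science and Technology, Henan Polytechnic University; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 6, pp. 1742-1745 (4 pages)

Funding: National Natural Science Foundation of China (91120303, 90820303, 90820011); National Basic Research Program of China (973 Program) (2004CB318105); National High Technology Research and Development Program of China (863 Program) (20060101Z4073, 2006AA01Z194)

Keywords: speech recognition; co-articulation; acoustic variability; acoustic modeling; syllable model

Classification: TP391.42 [Automation and Computer Technology: Computer Application Technology]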
