Tone integration algorithm based on articulatory features in speech recognition (cited by: 2)

Integrating tone models into speech recognition system based on articulatory feature


Abstract: This paper proposes an improved tone modeling method based on articulatory features and integrates it into the one-pass decoding of a stochastic segment model (SSM) for Mandarin speech recognition. Based on the pronunciation characteristics of Mandarin, seven articulatory features are defined to distinguish Chinese vowel and consonant information. With these features as targets, hierarchical multilayer perceptrons compute the posterior probabilities that the speech signal belongs to each of the 35 articulatory feature categories, and these posteriors are used together with conventional prosodic features for tone modeling. Following the decoding procedure of the SSM, tone model scores are computed for the paths that survive the two levels of pruning, then weighted and added to each path's total probability score. Experiments on the "863-test" set show that the tone model using the new articulatory feature set improves tone recognition accuracy by 3.11% absolute, and that integrating tone information reduces the character error rate of the SSM system from 13.67% to 12.74%. These results demonstrate the feasibility of applying tone information to the stochastic segment model.
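The score fusion step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Path` structure, the function names, and the `tone_weight` value are hypothetical, and all scores are assumed to be log-probabilities so that weighting and addition correspond to a log-linear combination.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    """A decoding hypothesis that survived pruning (hypothetical structure)."""
    syllables: List[str]
    base_log_score: float        # assumed: decoder log score before tone fusion
    tone_log_probs: List[float]  # assumed: one tone-model log-probability per syllable


def fused_score(path: Path, tone_weight: float) -> float:
    """Return the path score with the weighted tone log-score added."""
    return path.base_log_score + tone_weight * sum(path.tone_log_probs)


def rerank(paths: List[Path], tone_weight: float) -> List[Path]:
    """Sort the surviving paths by fused score, best first."""
    return sorted(paths, key=lambda p: fused_score(p, tone_weight), reverse=True)


if __name__ == "__main__":
    # Two competing hypotheses with made-up scores, for illustration only.
    a = Path(["ma", "hao"], base_log_score=-10.0, tone_log_probs=[-2.0, -3.0])
    b = Path(["ma", "hao"], base_log_score=-10.5, tone_log_probs=[-0.5, -0.5])
    best = rerank([a, b], tone_weight=0.5)[0]
    print(best is b)
```

With a tone weight of zero the ranking falls back to the decoder's own scores, so the weight controls how much the tone model can override the base hypothesis. Scoring only the paths that survive pruning keeps the extra tone-model cost limited to a small set of hypotheses.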

Authors: CHAO Hao, SONG Cheng, LIU Zhizhong

Affiliation: School of Computer Science and Technology, Henan Polytechnic University

Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2014, No. 23, pp. 21-25 (5 pages)

Funding: National Natural Science Foundation of China (No. 61300124); Henan Province Basic and Frontier Technology Research Program (No. 132300410332)

Keywords: speech recognition; stochastic segment modeling; tone modeling; articulatory feature; hierarchical multilayer perceptron classifiers

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


