A Mandarin tone modeling method based on articulatory features and its application in Mandarin speech recognition (cited by: 2)

Improved tone modeling by exploiting articulatory features for Mandarin speech recognition

Abstract: Articulatory features, which describe the manner in which speech sounds are produced, can complement traditional prosodic features and improve the accuracy of tone modeling. Based on an analysis of the pronunciation characteristics of Mandarin initials and finals, a set of 19 articulatory categories is defined. Hierarchical multilayer perceptron classifiers compute the posterior probability that the speech signal belongs to each category, and these 19 posteriors serve as articulatory tandem features. The articulatory features are then used together with prosodic features for tone modeling. Tone recognition experiments with three kinds of tone models show an absolute accuracy gain of about 5% when articulatory features are added to the prosodic features. When the proposed tone model is integrated into an LVCSR (Large Vocabulary Continuous Speech Recognition) system, the character error rate is reduced significantly.
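The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hierarchical multilayer perceptron is replaced here by a plain list of 19 raw class scores, and the function names, the four prosodic values, and the softmax normalization are all assumptions made for illustration. Only the count of 19 articulatory categories and the idea of appending their posteriors to the prosodic features come from the abstract.

```python
import math

NUM_ARTICULATORY_CLASSES = 19  # number of categories stated in the abstract


def softmax(scores):
    """Turn raw classifier scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_tone_features(articulatory_scores, prosodic_features):
    """Append the 19 articulatory posteriors to the prosodic feature vector.

    `articulatory_scores` stands in for the output layer of a classifier
    (hypothetical; the paper uses hierarchical MLP classifiers).
    """
    if len(articulatory_scores) != NUM_ARTICULATORY_CLASSES:
        raise ValueError("expected 19 articulatory class scores")
    posteriors = softmax(articulatory_scores)
    return list(prosodic_features) + posteriors


# Hypothetical inputs: made-up scores and four made-up prosodic values.
scores = [0.1 * i for i in range(NUM_ARTICULATORY_CLASSES)]
prosodic = [220.0, -3.5, 0.18, 62.0]
features = build_tone_features(scores, prosodic)
```

The combined vector would then be passed to whichever tone model is in use; the abstract reports results for three kinds of tone models but does not name them here.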

Authors: Chao Hao (晁浩), Yang Zhanlei (杨占磊), Liu Wenju (刘文举)

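Affiliations: School of Computer Science and Technology, Henan Polytechnic University; National Laboratory of Pattern Recognition (Institute of Automation, Chinese Academy of Sciences)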

Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2013, No. 10, pp. 2939-2944 (6 pages)

Funding: National Natural Science Foundation of China (91120303, 90820303, 90820011)

Keywords: speech recognition; tone modeling; Articulatory Feature (AF); hierarchical multilayer perceptron classifier

Classification: TP391.42 [Automation and Computer Technology: Computer Application Technology]
