Speech Recognition Based on a Pronunciation Model of Chinese Initials and Finals (cited by: 1)

An Improved Speech Recognition Method for Chinese Syllable


Abstract: Speech recognition is a fast-growing field of research. Although spoken Chinese has its own peculiarities, it still has much in common with the spoken forms of other languages. Every Chinese syllable consists of an initial consonant and a compound vowel (final). The initial is short and its waveform fluctuates sharply, while the final is long and comparatively stable. Itakura's conventional isolated-word recognition method uses linear predictive coefficients (LPC) as the speech model and the dynamic time warping (DTW) algorithm to match test and reference patterns [1], but it is not fully suited to Chinese monosyllable recognition. In this paper we modify Itakura's method and present an approach that is better suited to Chinese monosyllables. A syllable is split into its initial and final parts according to the variation of the LPC distance between adjacent frames of the speech signal, and the two parts are modelled separately according to their own characteristics. Because the final is represented by only a few selected LPC vectors, the syllable pattern contains fewer LPC vectors overall. This increases the weight of the initial within the syllable pattern and sharply reduces the storage the pattern requires. Experimental results show that the approach reduces computation by 50% compared with Itakura's conventional LPC/DTW recognizer, so that recognition is at least twice as fast, and that recognition accuracy reaches 95%, considerably higher than that obtained with Itakura's method.
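The pipeline described in the abstract (split at the point where adjacent-frame distance settles down, keep only a few frames of the final, then match with DTW) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean `frame_distance` stands in for the LPC distance, and the threshold rule in `split_initial_final`, the even spacing in `compress_pattern`, and all function and parameter names are hypothetical, since the abstract does not give these details.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors.

    Assumption: a stand-in for the LPC distance used in the paper.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def split_initial_final(frames, threshold):
    """Return the index of the first frame of the stable (final) part.

    Assumed rule: the final is the longest trailing run of frames whose
    adjacent-frame distance never exceeds `threshold`.
    """
    boundary = 0
    for i in range(1, len(frames)):
        if frame_distance(frames[i - 1], frames[i]) > threshold:
            boundary = i
    return boundary


def compress_pattern(frames, threshold, n_final=2):
    """Keep every initial frame but only `n_final` evenly spaced final frames."""
    b = split_initial_final(frames, threshold)
    initial, final = frames[:b], frames[b:]
    if len(final) <= n_final:
        return initial + final
    step = (len(final) - 1) / (n_final - 1) if n_final > 1 else 0
    picked = [final[round(i * step)] for i in range(n_final)]
    return initial + picked


def dtw_distance(test, ref):
    """Classic unconstrained DTW cost between two frame sequences."""
    n, m = len(test), len(ref)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(test[i - 1], ref[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy one-dimensional "frames": three rapidly changing frames, then a stable tail.
frames = [[0.0], [5.0], [1.0], [6.0], [6.1], [6.2], [6.1]]
pattern = compress_pattern(frames, threshold=1.0, n_final=2)
```

In this toy input the compressed pattern keeps the three fluctuating frames in full and only two of the four stable frames, so the initial makes up a larger share of the pattern and the DTW grid has fewer cells to fill, which is the effect the abstract attributes to the method.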

Authors: Bao Xin (鲍欣), Lin Qi'ao (林其璈), Zhang Yingfang (张英芳)

Affiliation: Northwestern Polytechnical University (西北工业大学)

Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 1992, No. 2, pp. 174-180 (7 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: Supported by the Aeronautical Science Foundation (航空科学基金).

Keywords: speech recognition; Chinese; initial consonant; compound vowel (final); computer; Itakura's conventional isolated word recognition method; Chinese monosyllable recognition

Classification: TP391.42 [Automation and Computer Technology: Computer Application Technology]

References (1)

1. Zhu Xuelong (朱雪龙). Digital Processing of Speech Signals (语音信号数字处理). 1983.
