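Abstract
The pronunciation of every Chinese character consists of two parts, an initial consonant and a compound vowel. The initial consonant is short in duration and its signal changes sharply, whereas the compound vowel is long and its signal is comparatively stable. The traditional isolated-word recognition scheme uses linear prediction coefficients (LPC) as the speech model coefficients and the dynamic time warping (DTW) algorithm for pattern matching, but it is not fully suited to Chinese monosyllable recognition. This paper uses the variation of the LPC distance between adjacent frames of the speech signal to segment the initial consonant from the compound vowel, and builds separate patterns for the two parts according to their different characteristics. This raises the weight of the initial consonant in the whole-syllable pattern and sharply reduces the amount of pattern data. Experimental results show that Chinese monosyllable recognition is more than twice as fast as with the traditional LPC/DTW algorithm, and that the recognition accuracy reaches 95%.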
Speech recognition is a fast-growing field of research. Although spoken Chinese has its own peculiarities, it still has much in common with other spoken languages. A Chinese syllable is composed of an initial consonant and a compound vowel. The initial consonant part of a syllable is short, and its speech waveform fluctuates sharply. The compound vowel part is long and stable by comparison. Itakura's conventional isolated word recognition method is based on Linear Predictive Coefficients (LPC) and uses the Dynamic Time-Warping (DTW) algorithm to register test and reference patterns[1], but it is not completely applicable to Chinese monosyllable recognition. In this paper, we modify Itakura's method and present an approach that we believe is better suited to Chinese monosyllable recognition. In our approach, a syllable is separated into an initial consonant part and a compound vowel part according to the variation of the LPC distances between adjacent frames of the speech signal. The initial consonant part and the compound vowel part are modelled separately on the basis of their own characteristics. Because several selected LPC vectors represent the compound vowel part, a Chinese monosyllable pattern is formed with fewer LPC vectors. Thus the proportion of the initial consonant part in a Chinese syllable pattern is increased, and the storage requirement of the new speech pattern is decreased sharply. The results show that the presented approach reduces computation by 50% compared with Itakura's conventional LPC/DTW recognizer, and that the recognition accuracy is 95 percent, which is considerably higher than that obtainable with Itakura's method.
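The segmentation idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which LPC distance, threshold, or vector-selection rule the authors use, so everything below is an assumption made for illustration: the Itakura log-likelihood ratio as the adjacent-frame distance, a hypothetical rule that places the boundary at the first run of `hold` consecutive distances below `threshold`, and evenly spaced sampling of `n_final` vectors from the compound vowel part. The function names and default values are likewise hypothetical.

```python
import numpy as np

def frame_signal(x, size=256, hop=128):
    """Cut a 1-D signal into overlapping frames (assumes len(x) >= size)."""
    n = 1 + (len(x) - size) // hop
    return np.stack([x[i * hop : i * hop + size] for i in range(n)])

def lpc(frame, order=10):
    """Autocorrelation-method LPC via Levinson-Durbin.

    Returns (a, r): a[0] == 1 and a[1:] are the predictor-error filter
    coefficients; r holds autocorrelation lags 0..order.
    """
    w = frame * np.hamming(len(frame))
    r = np.correlate(w, w, mode="full")[len(w) - 1 : len(w) + order]
    r[0] += 1e-9  # guard against an all-zero frame
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1 : 0 : -1]
        a[i] = k
        err *= 1.0 - k * k
    return a, r

def itakura_distance(a_ref, a_test, r_test):
    """Log-likelihood ratio distance (assumed measure, not confirmed by the abstract)."""
    lags = np.arange(len(r_test))
    R = r_test[np.abs(np.subtract.outer(lags, lags))]  # Toeplitz autocorrelation matrix
    return float(np.log((a_ref @ R @ a_ref) / (a_test @ R @ a_test)))

def adjacent_distances(frames, order=10):
    """Distance between each frame and the next one."""
    models = [lpc(f, order) for f in frames]
    return np.array([
        itakura_distance(models[t][0], models[t + 1][0], models[t + 1][1])
        for t in range(len(models) - 1)
    ])

def find_boundary(dists, threshold=0.2, hold=3):
    """Hypothetical rule: first frame from which `hold` consecutive
    adjacent-frame distances stay below `threshold`; None if no such run."""
    below = np.asarray(dists) < threshold
    for t in range(len(below) - hold + 1):
        if below[t : t + hold].all():
            return t
    return None

def build_pattern(lpc_vectors, boundary, n_final=4):
    """Keep every initial-consonant vector; keep only a few evenly spaced
    vectors from the compound vowel part (hypothetical selection rule)."""
    initial = lpc_vectors[:boundary]
    final = lpc_vectors[boundary:]
    count = min(n_final, len(final))
    picks = np.linspace(0, len(final) - 1, num=count).round().astype(int)
    return np.concatenate([initial, final[picks]])
```

For example, `find_boundary([0.9, 0.8, 0.1, 0.7, 0.05, 0.04, 0.03, 0.02])` returns `4`: the isolated dip at index 2 is ignored because the distance does not stay low. Keeping all initial-consonant vectors while thinning the compound vowel vectors is what raises the relative weight of the initial consonant in the pattern and shrinks the number of vectors that a subsequent DTW match has to align.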
Source
《西北工业大学学报》
EI
CAS
CSCD
PKU Core (Peking University core journals list)
1992, No. 2, pp. 174-180 (7 pages)
Journal of Northwestern Polytechnical University
Funding
Project supported by the Aeronautical Science Foundation
Keywords
Speech recognition
Chinese
Initial consonant
Compound vowel
Computer
Itakura's conventional isolated word recognition method
Chinese monosyllable recognition