Abstract
Articulatory features, which describe the manner in which speech sounds are produced, can complement traditional prosodic features and improve the accuracy of tone modeling. Based on an analysis of the pronunciation characteristics of Chinese initials and finals, the manner of articulation is divided into 19 categories. Hierarchical multilayer perceptron classifiers are then used to compute the posterior probabilities that the speech signal belongs to each of the 19 categories, and these posteriors serve as articulatory tandem features. The articulatory features are combined with traditional prosodic features for tone modeling. Tone recognition experiments show that adding the articulatory features raises accuracy by about 5% (absolute) under three different tone modeling methods. When the proposed tone model is integrated into a Large Vocabulary Continuous Speech Recognition (LVCSR) system, the character error rate is reduced significantly.
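The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the hierarchical classifier is built, so the code assumes one common two-stage design: a first MLP maps each acoustic frame to posteriors over the 19 categories, and a second MLP refines them from a short context window of first-stage posteriors. The weights are random and untrained, and every name and dimension (`init_mlp`, `stack_context`, the 39-dimensional acoustic input, the 4-dimensional prosodic vector, the context radius) is hypothetical and not taken from the paper.

```python
import numpy as np

NUM_CLASSES = 19  # number of pronunciation categories stated in the abstract


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def init_mlp(rng, n_in, n_hidden, n_out):
    """Random, untrained one-hidden-layer MLP (illustration only)."""
    return {
        "W1": rng.standard_normal((n_in, n_hidden)) * 0.1,
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, n_out)) * 0.1,
        "b2": np.zeros(n_out),
    }


def mlp_posteriors(mlp, x):
    """Forward pass: tanh hidden layer, softmax output (class posteriors)."""
    h = np.tanh(x @ mlp["W1"] + mlp["b1"])
    return softmax(h @ mlp["W2"] + mlp["b2"])


def stack_context(frames, radius):
    """Concatenate each frame with its +/- radius neighbours (edge frames repeated)."""
    padded = np.pad(frames, ((radius, radius), (0, 0)), mode="edge")
    n = frames.shape[0]
    return np.hstack([padded[i:i + n] for i in range(2 * radius + 1)])


def articulatory_features(acoustic, mlp1, mlp2, radius):
    """Assumed two-stage hierarchy: frame posteriors, then context-refined posteriors."""
    stage1 = mlp_posteriors(mlp1, acoustic)
    return mlp_posteriors(mlp2, stack_context(stage1, radius))


def tone_model_input(acoustic, prosodic, mlp1, mlp2, radius=2):
    """Append the 19 articulatory posteriors to the prosodic features of each frame."""
    af = articulatory_features(acoustic, mlp1, mlp2, radius)
    return np.hstack([prosodic, af])


# Synthetic data with hypothetical dimensions.
rng = np.random.default_rng(0)
T, ACOUSTIC_DIM, PROSODIC_DIM, RADIUS = 50, 39, 4, 2
acoustic = rng.standard_normal((T, ACOUSTIC_DIM))
prosodic = rng.standard_normal((T, PROSODIC_DIM))
mlp1 = init_mlp(rng, ACOUSTIC_DIM, 64, NUM_CLASSES)
mlp2 = init_mlp(rng, NUM_CLASSES * (2 * RADIUS + 1), 64, NUM_CLASSES)

features = tone_model_input(acoustic, prosodic, mlp1, mlp2, RADIUS)
print(features.shape)  # (50, 23): 4 prosodic dimensions + 19 articulatory posteriors
```

In a real system both MLPs would be trained on frames labeled with the 19 categories, and the combined vectors would feed whichever tone model is in use.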
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2013, No. 10, pp. 2939-2944 (6 pages)
Funding
National Natural Science Foundation of China (91120303, 90820303, 90820011)
Keywords
speech recognition
tone modeling
Articulatory Feature (AF)
hierarchical multilayer perceptron classifier