Abstract
The speech model and the language model are two extremely important components of a Chinese dictation machine, and the quality of the speech model directly affects the performance of both the language model and the dictation machine. This paper reports a large number of comparative experiments, carried out on a large speech corpus, on speech recognition units, speech recognition models, and scoring methods for the models' output observation vectors. The experiments show that very good recognition performance can be achieved by choosing the syllable as the speech recognition unit, using the center distance continuous probability model (CDCPM) based on the center distance normal (CDN) distribution, and adopting a nearest neighbor (NN) based scoring scheme for output observation vectors, i.e., the embedded multi model (EMM) scheme.
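The NN based scoring idea in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption: the CDN density is taken to be a half-normal density over the Euclidean distance between the observation vector and a template center, each template is a hypothetical (center, sigma) pair, and the score of an observation is the log-density of its best-matching template. The function names are illustrative only and are not taken from the paper.

```python
import math


def cdn_log_density(x, center, sigma):
    """Log-density of an ASSUMED center-distance normal form:
    a half-normal density over the Euclidean distance d = |x - center|."""
    d = math.dist(x, center)
    log_norm = math.log(2.0 / (math.sqrt(2.0 * math.pi) * sigma))
    return log_norm - (d * d) / (2.0 * sigma * sigma)


def nn_score(x, templates):
    """Nearest-neighbor style score: keep only the best-matching template.

    `templates` is a hypothetical list of (center, sigma) pairs that
    together describe one state of one recognition unit."""
    return max(cdn_log_density(x, center, sigma) for center, sigma in templates)


# Two illustrative templates embedded in the same state.
templates = [((0.0, 0.0), 1.0), ((3.0, 4.0), 1.0)]
score_near_first = nn_score((0.0, 1.0), templates)
score_on_second = nn_score((3.0, 4.0), templates)
```

Taking the maximum rather than a weighted sum means only the closest template contributes to the score, which is one plausible reading of "nearest neighbor based" scoring.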
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core Journal
1997, No. 9, pp. 37-40 (4 pages)
Journal of Tsinghua University (Science and Technology)
Funding
National "863" High-Tech Program
Keywords
speech recognition model
speech dictation machine
Chinese
center distance normal distribution
center distance continuous probability model
NN based scoring scheme
embedded multi model