Abstract
A multi-scale acoustic modeling framework for task-domain-independent keyword spotting is proposed. A large-scale phoneme set is obtained automatically through decision-tree-based phoneme clustering, and a context-dependent background model over these large-scale phonemes is trained with standard HMM training techniques, which improves the modeling accuracy for filler speech. Within this framework, an efficient method for constructing the search space with shared HMM states is also described. Experimental results show an average absolute improvement of 6.9% in keyword recognition accuracy. Furthermore, a neighboring acoustic context criterion for acoustic confidence measurement and a method for computing the likelihood ratio of candidate keywords on the multi-scale acoustic models are proposed, and the resulting scores are fused with FLDA. This significantly improves the effectiveness of the acoustic confidence measure: the system's equal error rate drops by an absolute 3.0%.
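The abstract names FLDA as the fusion step but does not say which features are fused or how they are scaled. As a rough illustration only, the sketch below fits a two-class Fisher linear discriminant on two hypothetical confidence features per candidate keyword and projects them onto one fused score. The feature values, the function names `fisher_lda_2d` and `fuse`, and the toy data are all assumptions, not taken from the paper.

```python
from statistics import mean


def fisher_lda_2d(pos, neg):
    """Fisher discriminant weights for 2-D features: w = Sw^-1 (m_pos - m_neg)."""
    m_pos = (mean(x for x, _ in pos), mean(y for _, y in pos))
    m_neg = (mean(x for x, _ in neg), mean(y for _, y in neg))

    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for samples, (mx, my) in ((pos, m_pos), (neg, m_neg)):
        for x, y in samples:
            dx, dy = x - mx, y - my
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy

    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")

    # Closed-form 2x2 inverse applied to the difference of class means.
    dmx, dmy = m_pos[0] - m_neg[0], m_pos[1] - m_neg[1]
    w1 = (syy * dmx - sxy * dmy) / det
    w2 = (-sxy * dmx + sxx * dmy) / det
    return w1, w2


def fuse(weights, features):
    """Project a 2-D feature vector onto the discriminant direction."""
    return weights[0] * features[0] + weights[1] * features[1]


# Hypothetical toy data: (feature_a, feature_b) per candidate keyword.
true_hits = [(2.0, 1.5), (2.5, 2.0), (3.0, 1.0)]
false_alarms = [(0.5, 0.2), (1.0, -0.5), (0.0, 0.5)]

weights = fisher_lda_2d(true_hits, false_alarms)
hit_scores = [fuse(weights, f) for f in true_hits]
alarm_scores = [fuse(weights, f) for f in false_alarms]
```

A decision threshold on the fused score would then accept or reject each candidate; sweeping that threshold is how an equal error rate would be measured.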
Source
Journal on Communications (《通信学报》)
EI
CSCD
PKU Core (北大核心)
2006, No. 2, pp. 137-141 (5 pages)
Keywords
acoustic confidence measure
multi-scale acoustic modeling
search space