Abstract
Vocal tract length normalization (VTLN) is one of the speaker adaptation methods used in speech recognition. This paper studies VTLN in noisy environments and reports a series of experiments. In the implementation, the warping factor is selected using a single-Gaussian-mixture model, an approach applied here for the first time in noisy conditions, and it gives good results. The experiments are based on the AURORA speech database, and the models are evaluated on the car-noise test set supplied with it. The results show that VTLN improves recognition to varying degrees under every noise condition tested. Iterative training improves on single-pass VTLN, with the best result obtained in the third iteration. In noisy environments, warping factors selected with a single-Gaussian-mixture model raise the average sentence recognition rate by nearly 1.68% compared with warping factors selected using other Gaussian mixture models. The gender-independent model with VTLN comes close to the gender-dependent model without VTLN, and with sufficient training data the gender-independent model with VTLN can perform better.
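As an illustration of the warping-factor selection summarized above, the following is a minimal sketch and not the paper's implementation. It assumes a piecewise-linear frequency warp, a grid of candidate factors from 0.88 to 1.12, and a single diagonal-covariance Gaussian as the scoring model. The names `warp_frequency`, `gaussian_loglik`, and `select_warp_factor`, the 4000 Hz band edge, and the toy "formant" data are all hypothetical choices made for this example.

```python
import math
from typing import Callable, List, Optional, Sequence

Frame = Sequence[float]


def warp_frequency(f: float, alpha: float, f_max: float = 4000.0,
                   knee: float = 0.85) -> float:
    """Piecewise-linear warp (assumed form): scale by alpha below a knee
    frequency, then join linearly so that f_max still maps to f_max."""
    f0 = knee * f_max / max(alpha, 1.0)  # keeps alpha * f0 below f_max
    if f <= f0:
        return alpha * f
    return alpha * f0 + (f_max - alpha * f0) * (f - f0) / (f_max - f0)


def gaussian_loglik(frames: Sequence[Frame], mean: Sequence[float],
                    var: Sequence[float]) -> float:
    """Total log-likelihood of the frames under one diagonal Gaussian."""
    total = 0.0
    for x in frames:
        for xi, m, v in zip(x, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
    return total


def select_warp_factor(extract: Callable[[float], List[Frame]],
                       mean: Sequence[float], var: Sequence[float],
                       grid: Optional[Sequence[float]] = None) -> float:
    """Grid search: return the candidate factor whose warped features
    score highest under the single-Gaussian model."""
    if grid is None:
        grid = [round(0.88 + 0.02 * i, 2) for i in range(13)]  # 0.88 .. 1.12
    return max(grid, key=lambda a: gaussian_loglik(extract(a), mean, var))


# Toy data (hypothetical): a speaker whose spectral peaks sit at the
# reference positions divided by 1.06.
reference = [500.0, 1500.0, 2500.0]
speaker_peaks = [f / 1.06 for f in reference]


def toy_extract(alpha: float) -> List[Frame]:
    # One "frame" of warped peak frequencies; a real front end would
    # recompute cepstral features with the warped filterbank instead.
    return [[warp_frequency(f, alpha) for f in speaker_peaks]]


best = select_warp_factor(toy_extract, reference, [100.0 ** 2] * 3)
print(best)
```

In an iterative scheme of the kind the abstract mentions, the selected factors would be used to re-extract the training features, the model would be retrained on them, and the selection would be repeated. That loop is omitted here.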
Source
Microprocessors (《微处理机》)
2006, No. 5, pp. 102-105 (4 pages)
Keywords
Vocal tract length normalization
Speech recognition
Speaker adaptation