Abstract
To exploit more language-discriminative information for reliable automatic language identification, this paper proposes an algorithm that uses maximum likelihood linear regression (MLLR) matrices, a technique from the model adaptation field, as features. The algorithm first trains a Gaussian mixture model (GMM) for each language and then computes MLLR matrices for every speech segment against the GMMs of all languages. The resulting multi-class MLLR matrices are normalized and concatenated into a supervector, which is the input feature for training and testing a support vector machine (SVM) classifier. Two normalization methods, mean/variance normalization and rank normalization, are compared, and the multi-class MLLR-SVM algorithm is compared with the conventional GMM language identification algorithm. Experiments show that rank normalization outperforms the traditional mean/variance normalization, and that the MLLR-SVM system, built on top of the GMM models, improves performance by 9.7% and is strongly complementary to the GMM classifier.
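The feature pipeline described above (flatten one MLLR matrix per language GMM, normalize, and form a supervector) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not define rank normalization in detail, so the sketch assumes the common form in which each feature value is replaced by the fraction of background values that fall below it. All function names and the toy data are hypothetical, and normalization is applied per supervector dimension, which for per-dimension methods is equivalent to normalizing each matrix entry before concatenation.

```python
from bisect import bisect_left
from statistics import mean, pstdev


def make_supervector(transforms):
    """Flatten and concatenate one MLLR matrix per language GMM."""
    return [v for matrix in transforms for row in matrix for v in row]


def fit_mean_variance(background):
    """Per-dimension (mean, std) over background supervectors."""
    # A zero standard deviation is replaced by 1.0 to avoid division by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in zip(*background)]


def apply_mean_variance(vector, stats):
    """Shift and scale each dimension to zero mean and unit variance."""
    return [(x - m) / s for x, (m, s) in zip(vector, stats)]


def fit_rank(background):
    """Per-dimension sorted background values (assumed rank reference)."""
    return [sorted(col) for col in zip(*background)]


def apply_rank(vector, sorted_columns):
    """Map each value to the fraction of background values below it."""
    return [bisect_left(col, x) / len(col)
            for x, col in zip(vector, sorted_columns)]


# Toy background set: the first dimension contains an outlier (100.0).
background = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [100.0, 40.0]]
segment = [1.5, 25.0]

ranked = apply_rank(segment, fit_rank(background))  # [0.5, 0.5]
scaled = apply_mean_variance(segment, fit_mean_variance(background))
```

In this toy case the outlier shifts the mean and inflates the standard deviation of the first dimension, while the rank-normalized value depends only on ordering. That is one plausible reason rank normalization can be more robust, though the abstract itself reports only the empirical result.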
Source
Journal of Tsinghua University (Science and Technology)
EI
CAS
CSCD
PKU Core Journals
2009, Issue S1, pp. 1283-1287 (5 pages)
Funding
National Natural Science Foundation of China (60776800)
National High-Tech Research and Development Program of China ("863" Program) (2006AA010101, 2007AA04Z223, 2008AA02Z414)