Abstract
This paper studies text-independent speaker recognition that uses the transform matrices of maximum likelihood linear regression (MLLR), a technique from speaker adaptation, as features. We first introduce an MLLRSV-SVM speaker recognition algorithm based on a universal background model (UBM), and then extend it to multiple transform classes through high-level phoneme clustering to further improve recognition performance. After several channel compensation techniques are applied, on the same-channel and cross-channel data of the NIST SRE 2006 1conv4w-1conv4w/mic task (one training segment, one test segment), the MLLR-based system performs close to the best other systems and is strongly complementary to them. Simple linear fusion greatly improves recognition performance.
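The two ideas named in the abstract, using MLLR transforms as a feature vector and fusing system scores linearly, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mllr_supervector` and `linear_fusion`, the toy dimensions, and the fusion weight are all assumptions made for illustration, and SVM training and channel compensation are omitted.

```python
import numpy as np

def mllr_supervector(transforms):
    """Flatten and concatenate per-class MLLR transforms [A | b] into one vector.

    Hypothetical helper: each transform is a d x (d + 1) array.
    """
    return np.concatenate([np.asarray(W, dtype=float).ravel() for W in transforms])

def linear_fusion(scores_a, scores_b, weight=0.5):
    """Weighted sum of two systems' scores for the same trials.

    The weight is an assumed value; in practice it would be tuned on held-out data.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b

# Toy data: two phone classes, 3-dimensional features, so each transform is 3 x 4.
rng = np.random.default_rng(0)
transforms = [rng.standard_normal((3, 4)) for _ in range(2)]
sv = mllr_supervector(transforms)  # feature vector of length 2 * 3 * 4 = 24

# Fuse the scores of two hypothetical systems on two trials.
fused = linear_fusion([1.0, -0.5], [0.2, 0.3], weight=0.7)
```

In a full system, vectors such as `sv` would be the input to an SVM, and `linear_fusion` would combine the MLLR system's scores with those of another system.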
Source
Acta Automatica Sinica (《自动化学报》)
EI
CSCD
Peking University Core Journals (北大核心)
2009, No. 5, pp. 546-550 (5 pages)
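Funding
Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA010101, 2007AA04Z223) and the joint funding program of the National Natural Science Foundation of China and Microsoft Research Asia (60776800)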
Keywords
Speaker recognition, maximum likelihood linear regression (MLLR), support vector machine (SVM), channel compensation