Abstract
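In text-independent speaker recognition, prosodic features are used for speaker recognition tasks because of properties such as their insensitivity to channel and environmental noise. This paper models the prosodic parameters with a support vector machine based on Gaussian mixture model supervectors and applies within-class covariance feature mapping to the model supervectors. The single system improves on the conventional Gaussian mixture model-universal background model (GMM-UBM) baseline system by 40.19%. When fused with the paper's verification system based on acoustic cepstral parameters, the method improves the recognition performance of the overall system by 9.25%. On the NIST (National Institute of Standards and Technology) 2006 speaker evaluation database, the fused system achieves an equal error rate of 4.9%.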
In text-independent speaker recognition systems, prosodic features are widely used to verify speaker identity because they are less sensitive to channel and noise effects than cepstral features. This paper proposes a verification method called the prosody Gaussian covariance projection-support vector machine (PGCP-SVM). The method is based on pitch and energy and their dynamic features. Unlike conventional techniques, target speaker models are built with a support vector machine (SVM) on Gaussian mixture model (GMM) mean-supervectors. In particular, the within-class covariance projection technique is applied to the mean-supervectors, which improves the performance of the prosodic system. Combined with the acoustic Mel-frequency cepstral coefficient system, the performance is improved by 9.25%. On the NIST 2006 SRE corpus, the equal error rate (EER) of the combined system reaches 4.9%.
Source
Journal of Data Acquisition and Processing (《数据采集与处理》)
CSCD
Peking University Core Journals
2010, Issue 1, pp. 76-80 (5 pages)
Funding
NSFC-Microsoft Research Asia joint funding project
National Natural Science Foundation of China (grant 60970161)
Keywords
speaker verification
prosodic features
supervector
within-class covariance projection