Novel Application of Discriminative Training in Vocal Password

Abstract: In the vocal password task, data sparsity has so far made discriminative training difficult to apply. This paper therefore proposes a new discriminative framework for vocal password, built on a feature vector that represents a distance measure, which alleviates the data sparsity problem to some extent. Discriminative training under the minimum classification error (MCE) criterion is carried out on these new feature vectors for positive and negative samples, so that vocal password is converted from a verification problem into a two-class classification problem. On a free-speaking-style corpus of 60 speakers, fusing the discriminative system with the Gaussian mixture model-universal background model (GMM-UBM) system yields an equal error rate (EER) of 4.48%, a relative reduction of 17.95% and 59.68% compared with the GMM-UBM and dynamic time warping (DTW) baseline systems, respectively. The experimental results show that this new application of discriminative training to the vocal password system is feasible and effective.
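The relative figures in the abstract can be unpacked with a little arithmetic. The sketch below assumes that "relative reduction" means (baseline EER - fused EER) / baseline EER, which is the usual convention but is not stated in the abstract. The function names are hypothetical, and the baseline EERs it produces are implied values derived from the reported numbers, not figures quoted from the paper.

```python
def relative_reduction(baseline_eer: float, new_eer: float) -> float:
    """Relative EER reduction in percent: (baseline - new) / baseline * 100."""
    return (baseline_eer - new_eer) / baseline_eer * 100.0


def implied_baseline_eer(new_eer: float, rel_reduction_pct: float) -> float:
    """Invert relative_reduction: recover the baseline EER (in percent)
    from the new EER and the relative reduction (in percent)."""
    return new_eer / (1.0 - rel_reduction_pct / 100.0)


# Values reported in the abstract.
fused_eer = 4.48          # EER of the fused system, in percent
reduction_vs_gmm = 17.95  # relative reduction vs. GMM-UBM, in percent
reduction_vs_dtw = 59.68  # relative reduction vs. DTW, in percent

# Implied (not reported) baseline EERs under the assumed definition.
gmm_ubm_eer = implied_baseline_eer(fused_eer, reduction_vs_gmm)  # about 5.46
dtw_eer = implied_baseline_eer(fused_eer, reduction_vs_dtw)      # about 11.11

print(f"Implied GMM-UBM baseline EER: {gmm_ubm_eer:.2f}%")
print(f"Implied DTW baseline EER:     {dtw_eer:.2f}%")
```

Under that assumption the GMM-UBM baseline would sit near 5.46% EER and the DTW baseline near 11.11%, which shows that the gain over DTW is much larger in absolute terms than the gain over GMM-UBM.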

Authors: Pan Yiqian, Hu Guoping, Dai Lirong, Liu Qingfeng

Affiliations: iFLYTEK Speech Laboratory, University of Science and Technology of China; Anhui USTC iFLYTEK Co., Ltd.

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2012, No. 4, pp. 404-409 (6 pages)

Funding: Supported by the Anhui Province Science and Technology Key Project (09120201003)

Keywords: vocal password; speaker verification; discriminative training; GMM-UBM

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]



