
Speaker Verification with Model-based and Score-based Unsupervised Adaptation Method

Cited by: 1
Abstract: In text-independent speaker verification, information from previous test utterances can be used to dynamically update the speaker models or the test scores, so that the models reflect the relationship between test utterances and speaker models more accurately. This strategy, referred to as the unsupervised mode, is of considerable practical importance for real speaker recognition systems. In addition to an unsupervised speaker-model adaptation method, this paper proposes a score-domain unsupervised adaptation algorithm: a bi-Gaussian function is first used to build a prior model of the scores, and the maximum a posteriori (MAP) criterion is then applied to adjust the parameters of the score normalization model. During testing, the score-domain and model-domain unsupervised algorithms complement each other and improve recognition performance. On the NIST SRE 2006 1conv4w-1conv4w corpus, the system using both model-domain and score-domain unsupervised adaptation achieves an equal error rate (EER) of 4.3% and a minimum detection cost function (minDCF) of 0.021.
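To make the score-domain idea concrete, the sketch below shows one possible reading of the approach described in the abstract: a bi-Gaussian (target/impostor) prior over raw verification scores whose impostor statistics are updated with a MAP-style rule after each trial and then reused for score normalization. The class name, the relevance factor, the prior means and variances, and the choice to adapt only the impostor mean are illustrative assumptions, not details taken from the paper.

```python
import math

# Illustrative sketch (not the paper's exact algorithm): a bi-Gaussian prior
# over raw verification scores, with a MAP-style update of the impostor-score
# statistics that are then used for Z-norm style score normalization.

class BiGaussianScoreModel:
    """Prior score model: one Gaussian for impostor trials, one for target trials."""

    def __init__(self, imp_mean=-1.0, imp_std=1.0, tgt_mean=2.0, tgt_std=1.0,
                 relevance=16.0, p_target=0.5):
        # Prior (hyper-)parameters; the numbers here are placeholders.
        self.prior_imp_mean = imp_mean
        self.imp_mean, self.imp_std = imp_mean, imp_std
        self.tgt_mean, self.tgt_std = tgt_mean, tgt_std
        self.relevance = relevance          # MAP relevance factor (assumed value)
        self.p_target = p_target            # prior probability of a target trial
        self.n_imp = 0.0                    # accumulated soft impostor count
        self.sum_imp = 0.0                  # accumulated soft impostor score sum

    @staticmethod
    def _gauss(x, mean, std):
        return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

    def posterior_target(self, score):
        """Posterior probability that a raw score was produced by the target speaker."""
        pt = self.p_target * self._gauss(score, self.tgt_mean, self.tgt_std)
        pi = (1.0 - self.p_target) * self._gauss(score, self.imp_mean, self.imp_std)
        return pt / (pt + pi)

    def map_update(self, score):
        """MAP-style update of the impostor mean used for normalization."""
        w = 1.0 - self.posterior_target(score)   # soft impostor occupancy of this trial
        self.n_imp += w
        self.sum_imp += w * score
        if self.n_imp > 0.0:
            alpha = self.n_imp / (self.n_imp + self.relevance)
            empirical_mean = self.sum_imp / self.n_imp
            # Interpolate between the observed impostor scores and the prior mean.
            self.imp_mean = alpha * empirical_mean + (1.0 - alpha) * self.prior_imp_mean

    def normalize(self, score):
        """Z-norm style normalization with the (possibly adapted) impostor statistics."""
        return (score - self.imp_mean) / self.imp_std


if __name__ == "__main__":
    model = BiGaussianScoreModel()
    for raw_score in [-0.8, -1.3, 2.4, -0.9]:     # a stream of trial scores
        print(f"{raw_score:+.1f} -> {model.normalize(raw_score):+.3f}")
        model.map_update(raw_score)               # unsupervised update after each trial
```

The same relevance-factor interpolation is the core of the model-domain counterpart, where MAP adaptation is applied to the GMM speaker-model means (see reference 7 in the list below); only the sufficient statistics change from score-level to frame-level occupancies.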
Source: Acta Automatica Sinica (《自动化学报》), indexed in EI, CSCD, and the Peking University Core Journal list, 2009, Issue 3, pp. 267-271 (5 pages)
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA010104)
Keywords: Speaker verification, Gaussian mixture model (GMM), unsupervised mode, score normalization
  • Related Literature

References (10)

  • 1. Vogt R, Sridharan S. Experiments in session variability modeling for speaker verification. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D. C., USA: IEEE, 2006. 897-900.
  • 2. Campbell W M, Sturim D E, Reynolds D A. SVM based speaker verification using a GMM supervector kernel and NAP variability compensation. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D. C., USA: IEEE, 2006. 97-100.
  • 3. NIST. Speaker Recognition Evaluation Plan [Online], available: http://www.nist.gov/speech/tests/sre, September 1, 2008.
  • 4. Vair C, Colibro D, Castaldo F, Dalmasso E, Laface P. Loquendo - Politecnico di Torino's 2006 NIST speaker recognition evaluation system. In: Proceedings of the 8th Conference in the Annual Series of INTERSPEECH Events and the 10th Biennial EUROSPEECH Conference. Antwerp, Belgium: ISCA, 2007. 1238-1241.
  • 5. Matejka P, Burget L, Schwarz P, Glembek O, Karafiat M, Grezl F. STBU system for the NIST 2006 speaker recognition evaluation. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D. C., USA: IEEE, 2007. 221-224.
  • 6. Preti A, Bonastre J F, Matrouf D. Confidence measure based unsupervised target model adaptation for speaker verification. In: Proceedings of the 8th Conference in the Annual Series of INTERSPEECH Events and the 10th Biennial EUROSPEECH Conference. Antwerp, Belgium: ISCA, 2007. 754-757.
  • 7. Reynolds D A, Quatieri T F, Dunn R B. Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 2000, 10(1-3): 19-41.
  • 8. Campbell W M, Sturim D E, Reynolds D A. Support vector machines using GMM supervectors for speaker verification. IEEE Signal Processing Letters, 2006, 13(5): 308-311.
  • 9. Castaldo F, Colibro D, Dalmasso E, Laface P, Vair C. Compensation of nuisance factors for speaker and language recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(7): 1969-1978.
  • 10. Bimbot F, Bonastre J F, Fredouille C, Gravier G, Magrin-Chagnolleau I, Meignier S. A tutorial on text-independent speaker verification. EURASIP Journal on Applied Signal Processing, 2004, 2004(4): 430-451.

Co-cited References (8)

  • 1. 李战明, 陈迪. A speaker recognition method based on a wavelet neural network hybrid model [J]. Journal of Lanzhou University of Technology (兰州理工大学学报), 2007, 33(2): 77-80. (Cited by: 3)
  • 2. Lung S Y. Wavelet feature domain adaptive noise reduction using learning algorithm for text-independent speaker recognition [J]. Pattern Recognition, 2007, 40: 2603-2606.
  • 3. Milner B. Inclusion of temporal information into features for speech recognition [C]//Proceedings of the ICSLP'96, 1996: 256-259.
  • 4. Minematsu N, Asakawa S, Hirose K. Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics [C]//Proceedings of the 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007: 890-893.
  • 5. Ross M, Shaffer H, Cohen A, et al. Average magnitude difference function pitch extractor [J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1974, 22(5): 353-362.
  • 6. Chao Y H, Tsai W H, Wang H M. Improving GMM-UBM speaker verification using discriminative feedback adaptation [J]. Computer Speech and Language, 2009, 23: 376-388.
  • 7. Navratil J, Ramaswamy G N. The awe and mystery of T-norm [C]//Proceedings of the EUROSPEECH 2003, Geneva, Switzerland, 2003: 2009-2012.
  • 8. 李明, 张勇, 李军权, 张亚芬. A practical speaker feature extraction method [J]. Computer Engineering and Applications (计算机工程与应用), 2008, 44(10): 51-53. (Cited by: 2)

Citing Literature (1)

Secondary Citing Literature (3)
