
Speaker Verification with Model-based and Score-based Unsupervised Adaptation Method

Cited by: 1
Abstract: In text-independent speaker verification, information from previous test utterances can be used to dynamically update the speaker models or the test scores, so that the models reflect the relationship between test utterances and speaker models more accurately. This strategy, referred to as the unsupervised mode, is of considerable practical importance for real speaker recognition systems. In addition to an unsupervised speaker-model adaptation method, this paper proposes a score-domain unsupervised adaptation algorithm: a bi-Gaussian function is first used to build a prior model of the scores, and the maximum a posteriori (MAP) criterion is then applied to adjust the parameters of the score normalization model. During testing, the score-domain and model-domain unsupervised algorithms complement each other and improve recognition performance. On the NIST SRE 2006 1conv4w-1conv4w corpus, the system using both model-domain and score-domain unsupervised adaptation achieves an equal error rate (EER) of 4.3% and a minimum detection cost function (minDCF) of 0.021.
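To make the score-domain idea concrete, the sketch below shows one possible reading of the approach described in the abstract: a bi-Gaussian (target/impostor) prior over raw verification scores whose impostor statistics are updated with a MAP-style rule after each trial and then reused for score normalization. The class name, the relevance factor, the prior means and variances, and the choice to adapt only the impostor mean are illustrative assumptions, not details taken from the paper.

```python
import math

# Illustrative sketch (not the paper's exact algorithm): a bi-Gaussian prior
# over raw verification scores, with a MAP-style update of the impostor-score
# statistics that are then used for Z-norm style score normalization.

class BiGaussianScoreModel:
    """Prior score model: one Gaussian for impostor trials, one for target trials."""

    def __init__(self, imp_mean=-1.0, imp_std=1.0, tgt_mean=2.0, tgt_std=1.0,
                 relevance=16.0, p_target=0.5):
        # Prior (hyper-)parameters; the numbers here are placeholders.
        self.prior_imp_mean = imp_mean
        self.imp_mean, self.imp_std = imp_mean, imp_std
        self.tgt_mean, self.tgt_std = tgt_mean, tgt_std
        self.relevance = relevance          # MAP relevance factor (assumed value)
        self.p_target = p_target            # prior probability of a target trial
        self.n_imp = 0.0                    # accumulated soft impostor count
        self.sum_imp = 0.0                  # accumulated soft impostor score sum

    @staticmethod
    def _gauss(x, mean, std):
        return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

    def posterior_target(self, score):
        """Posterior probability that a raw score was produced by the target speaker."""
        pt = self.p_target * self._gauss(score, self.tgt_mean, self.tgt_std)
        pi = (1.0 - self.p_target) * self._gauss(score, self.imp_mean, self.imp_std)
        return pt / (pt + pi)

    def map_update(self, score):
        """MAP-style update of the impostor mean used for normalization."""
        w = 1.0 - self.posterior_target(score)   # soft impostor occupancy of this trial
        self.n_imp += w
        self.sum_imp += w * score
        if self.n_imp > 0.0:
            alpha = self.n_imp / (self.n_imp + self.relevance)
            empirical_mean = self.sum_imp / self.n_imp
            # Interpolate between the observed impostor scores and the prior mean.
            self.imp_mean = alpha * empirical_mean + (1.0 - alpha) * self.prior_imp_mean

    def normalize(self, score):
        """Z-norm style normalization with the (possibly adapted) impostor statistics."""
        return (score - self.imp_mean) / self.imp_std


if __name__ == "__main__":
    model = BiGaussianScoreModel()
    for raw_score in [-0.8, -1.3, 2.4, -0.9]:     # a stream of trial scores
        print(f"{raw_score:+.1f} -> {model.normalize(raw_score):+.3f}")
        model.map_update(raw_score)               # unsupervised update after each trial
```

The same relevance-factor interpolation is the core of the model-domain counterpart, where MAP adaptation is applied to the GMM speaker-model means (see reference 7 in the list below); only the sufficient statistics change from score-level to frame-level occupancies.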
Source: Acta Automatica Sinica (《自动化学报》), indexed in EI, CSCD, and the Peking University Core Journal list, 2009, Issue 3, pp. 267-271 (5 pages)
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA010104)
Keywords: Speaker verification, Gaussian mixture model (GMM), unsupervised mode, score normalization
  • Related Literature

References (10)

  • 1. Vogt R, Sridharan S. Experiments in session variability modeling for speaker verification. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D. C., USA: IEEE, 2006. 897-900.
  • 2. Campbell W M, Sturim D E, Reynolds D A. SVM based speaker verification using a GMM supervector kernel and NAP variability compensation. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D. C., USA: IEEE, 2006. 97-100.
  • 3. NIST. Speaker Recognition Evaluation Plan [Online], available: http://www.nist.gov/speech/tests/sre, September 1, 2008.
  • 4. Vair C, Colibro D, Castaldo F, Dalmasso E, Laface P. Loquendo - Politecnico di Torino's 2006 NIST speaker recognition evaluation system. In: Proceedings of the 8th Conference in the Annual Series of INTERSPEECH Events and the 10th Biennial EUROSPEECH Conference. Antwerp, Belgium: ISCA, 2007. 1238-1241.
  • 5. Matejka P, Burget L, Schwarz P, Glembek O, Karafiat M, Grezl F. STBU system for the NIST 2006 speaker recognition evaluation. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D. C., USA: IEEE, 2007. 221-224.
  • 6. Preti A, Bonastre J F, Matrouf D. Confidence measure based unsupervised target model adaptation for speaker verification. In: Proceedings of the 8th Conference in the Annual Series of INTERSPEECH Events and the 10th Biennial EUROSPEECH Conference. Antwerp, Belgium: ISCA, 2007. 754-757.
  • 7. Reynolds D A, Quatieri T F, Dunn R B. Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 2000, 10(1-3): 19-41.
  • 8. Campbell W M, Sturim D E, Reynolds D A. Support vector machines using GMM supervectors for speaker verification. IEEE Signal Processing Letters, 2006, 13(5): 308-311.
  • 9. Castaldo F, Colibro D, Dalmasso E, Laface P, Vair C. Compensation of nuisance factors for speaker and language recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(7): 1969-1978.
  • 10. Bimbot F, Bonastre J F, Fredouille C, Gravier G, Magrin-Chagnolleau I, Meignier S. A tutorial on text-independent speaker verification. EURASIP Journal on Applied Signal Processing, 2004, 2004(4): 430-451.

Co-cited References (8)

  • 1. 李战明, 陈迪. A speaker recognition method based on a wavelet neural network hybrid model [J]. Journal of Lanzhou University of Technology (兰州理工大学学报), 2007, 33(2): 77-80. (Cited by: 3)
  • 2. Lung S Y. Wavelet feature domain adaptive noise reduction using learning algorithm for text-independent speaker recognition [J]. Pattern Recognition, 2007, 40: 2603-2606.
  • 3. Milner B. Inclusion of temporal information into features for speech recognition [C]//Proceedings of the ICSLP'96, 1996: 256-259.
  • 4. Minematsu N, Asakawa S, Hirose K. Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics [C]//Proceedings of the 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007: 890-893.
  • 5. Ross M, Shaffer H, Cohen A, et al. Average magnitude difference function pitch extractor [J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1974, 22(5): 353-362.
  • 6. Chao Y H, Tsai W H, Wang H M. Improving GMM-UBM speaker verification using discriminative feedback adaptation [J]. Computer Speech and Language, 2009, 23: 376-388.
  • 7. Navratil J, Ramaswamy G N. The awe and mystery of T-norm [C]//Proceedings of the EUROSPEECH 2003, Geneva, Switzerland, 2003: 2009-2012.
  • 8. 李明, 张勇, 李军权, 张亚芬. A practical speaker feature extraction method [J]. Computer Engineering and Applications (计算机工程与应用), 2008, 44(10): 51-53. (Cited by: 2)

Citing Literature (1)

Secondary Citing Literature (3)
