Fast Speaker Adaptation Based on Triple Diagonal Transform Matrices and Shared Block Matrices
(Original Chinese title: 基于三对角和共享分块对角转换矩阵的快速说话人自适应方法)
Abstract: Within the maximum likelihood linear regression (MLLR) framework, this paper proposes two fast speaker adaptation approaches: Speaker Adaptation using Triple Diagonal matrices (SATD) in the log-spectral domain, and Speaker Adaptation using Shared Block Diagonal matrices (SASBD) in the cepstral domain. Drawing on prior knowledge, both approaches describe inter-speaker variation with fewer parameters, so robust parameter estimates can be obtained from only a small amount of adaptation data. Experiments on an isolated-word recognition system with whole-word models and on an isolated-word recognition system with triphone models show that the proposed approaches adapt faster than traditional MLLR adaptation.
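To make the "fewer parameters" argument concrete, the following is a minimal sketch that only counts the free entries of differently structured transform matrices. It does not reproduce the paper's estimation procedure. The dimensions are assumptions for illustration and are not stated in the abstract: a 24-channel log-spectral vector, and a 39-dimensional cepstral vector split into three 13-dimensional streams that share one block. Bias terms are left out of the counts.

```python
# Illustrative parameter counts for structured MLLR-style transform matrices.
# All dimensions below are hypothetical; the abstract does not specify them.

def tridiagonal_mask(n):
    """Return an n x n 0/1 mask that is 1 on the main diagonal and its two neighbours."""
    return [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]

def count_free_full(n):
    """A full n x n transform has n * n free entries."""
    return n * n

def count_free_tridiagonal(n):
    """Count the nonzero positions of a tridiagonal n x n transform (3n - 2 for n >= 2)."""
    return sum(sum(row) for row in tridiagonal_mask(n))

def count_free_block_diagonal(block_size, num_blocks):
    """Block-diagonal transform whose blocks are estimated independently."""
    return num_blocks * block_size * block_size

def count_free_shared_block(block_size):
    """Block-diagonal transform whose blocks are all tied to one shared block."""
    return block_size * block_size

N_LOGSPEC = 24   # assumed number of log-spectral channels
BLOCK = 13       # assumed size of each cepstral stream
NUM_BLOCKS = 3   # assumed streams: static, delta, delta-delta

counts = {
    "full (log-spectral)": count_free_full(N_LOGSPEC),
    "tridiagonal (log-spectral)": count_free_tridiagonal(N_LOGSPEC),
    "full (cepstral)": count_free_full(BLOCK * NUM_BLOCKS),
    "block diagonal (cepstral)": count_free_block_diagonal(BLOCK, NUM_BLOCKS),
    "shared block diagonal (cepstral)": count_free_shared_block(BLOCK),
}

for name, value in counts.items():
    print(f"{name}: {value}")
```

Under these assumed dimensions, the structured transforms have far fewer entries to estimate than the full matrices, which is the intuition behind needing less adaptation data.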
Authors: Ding Guohong (丁国宏), Xu Bo (徐波)
Source: Acta Electronica Sinica (《电子学报》), 2004, No. 10, pp. 1709-1712, 1719 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: fast adaptation; transform matrix; MLLR; tridiagonal matrix; block diagonal matrix; calculations; mathematical models; matrix algebra; maximum likelihood estimation; regression analysis; word processing