
Rapid Speaker Adaptation Based on Maximum-likelihood Variable Subspace (cited by: 3)
Abstract: This paper proposes a rapid speaker adaptation method based on a maximum-likelihood variable subspace. In the training stage, Principal Component Analysis (PCA) is performed on the Speaker Dependent (SD) model parameters of the training speakers to obtain a set of speaker basis vectors. In the adaptation stage, the subset of basis vectors most relevant to the current speaker is selected using the maximum-likelihood criterion; the new speaker's SD model is then constrained to the speaker subspace spanned by these basis vectors, and adaptation is carried out by estimating the coefficient of each basis vector. Unlike conventional subspace-based speaker adaptation methods, the speaker subspace here is chosen dynamically during adaptation, so fewer parameters need to be estimated and more robust adaptation is obtained from small amounts of adaptation data. In continuous speech recognition adaptation experiments on a Microsoft corpus, given a very small amount of adaptation data (less than 5 s), the proposed method outperforms both the classical eigenvoice method and the Maximum Likelihood Linear Regression (MLLR) method in supervised and unsupervised modes.
Source: Journal of Electronics & Information Technology (《电子与信息学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 3, pp. 571-575 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (60872142)
Keywords: Continuous speech recognition; Speaker adaptation; Eigenvoice; Subspace method
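
The two stages described in the abstract can be illustrated with a small NumPy sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: each speaker model is flattened into a mean supervector, all Gaussians are assumed to have unit variance (so the maximum-likelihood coefficients reduce to weighted least squares), and the basis subset is chosen by greedy forward selection. The abstract does not specify the selection procedure, so the greedy search and the function names `train_speaker_bases`, `fit_weights`, `aux_score`, and `adapt` are hypothetical.

```python
import numpy as np


def train_speaker_bases(sd_supervectors, n_bases):
    """Training stage: PCA on SD mean supervectors (one row per speaker)."""
    mean = sd_supervectors.mean(axis=0)
    centered = sd_supervectors - mean
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_bases]


def fit_weights(bases, resid, counts):
    """ML coefficients for the given bases (assumes unit-variance Gaussians).

    counts holds the occupancy of each supervector dimension; dimensions
    with no adaptation data get a count of zero and are ignored.
    """
    a = (bases * counts) @ bases.T
    b = (bases * counts) @ resid
    # lstsq returns a minimum-norm solution if a is rank deficient.
    return np.linalg.lstsq(a, b, rcond=None)[0]


def aux_score(bases, weights, resid, counts):
    """Log-likelihood of the adaptation statistics, up to a constant."""
    err = resid - weights @ bases
    return -0.5 * np.sum(counts * err ** 2)


def adapt(mean, bases, obs_mean, counts, n_select):
    """Adaptation stage: pick n_select bases greedily, then fit coefficients.

    Greedy forward selection is an assumption made for this sketch.
    """
    resid = obs_mean - mean
    chosen = []
    for _ in range(n_select):
        best_score, best_k = None, None
        for k in range(len(bases)):
            if k in chosen:
                continue
            idx = chosen + [k]
            w = fit_weights(bases[idx], resid, counts)
            score = aux_score(bases[idx], w, resid, counts)
            if best_score is None or score > best_score:
                best_score, best_k = score, k
        chosen.append(best_k)
    weights = fit_weights(bases[chosen], resid, counts)
    adapted = mean + weights @ bases[chosen]
    return chosen, weights, adapted
```

With only `n_select` coefficients to estimate per speaker, the number of free parameters stays small even when the full basis set is large, which is the property the abstract credits for robustness on very short adaptation data.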


