
Performance and Optimization of Speaker Recognition Based on Phoneme-Level Feature Supervectors
Abstract High-level information such as phoneme-level features is regarded as unaffected by the channel, so it is considered a useful complement to speaker recognition systems built on low-level acoustic parameters. High-level information, however, suffers from data sparsity. This paper builds a recognition method based on phoneme feature supervectors and analyzes its performance using the BUT phoneme recognizer. Data pruning and Kernel Principal Component Analysis (KPCA) mapping are then tried as ways to improve that performance. The results show that pruning does not effectively improve recognition performance, whereas the recognition algorithm combined with KPCA mapping improves significantly. When the system is further fused with a mainstream GMM-UBM (Gaussian Mixture Model-Universal Background Model) system, the EER (Equal Error Rate) falls from 8.4% for the GMM-UBM system alone to 6.7%.
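The pipeline summarized in the abstract (KPCA mapping of supervectors, score fusion with a second system, and EER as the metric) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the RBF kernel, the `gamma` value, the number of components, the z-normalization, and the fusion `weight` are all assumptions, since the abstract gives none of them. The function names are hypothetical.

```python
import numpy as np

def kpca_fit_transform(X, n_components, gamma):
    """Project training supervectors with kernel PCA (RBF kernel assumed)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped against rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Training-point projections equal eigenvector * sqrt(eigenvalue).
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def fuse_scores(scores_a, scores_b, weight=0.5):
    """Linear fusion of z-normalized scores; the weight is a hypothetical choice."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    return weight * za + (1.0 - weight) * zb

def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores."""
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, impostor]))
    # False rejection: a target trial scores below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance: an impostor trial scores at or above the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0
```

In use, the KPCA-mapped supervectors would feed a classifier whose trial scores are fused with the GMM-UBM scores, and `equal_error_rate` would then be computed separately on the target and impostor trials of each system to compare them.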
Source Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2011, No. 26, pp. 140-142 (3 pages)
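Funding National Natural Science Foundation of China (No. 60970161); Fundamental Research Funds for the Central Universities; Anhui Provincial Fund for Outstanding Young Talents in Universities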
Keywords phoneme feature; speaker recognition; Kernel Principal Component Analysis (KPCA); data pruning

