
Space Transformation Based on Signal Subspace in Joint Factor Analysis (cited by: 2)
Abstract: Joint factor analysis (JFA) is the mainstream technique in text-independent speaker verification systems because it estimates the speaker and channel spaces explicitly. However, owing to limitations of the algorithm's procedure, the speaker space and the channel space obtained by JFA inevitably overlap. To resolve this overlap, a space transformation based on the signal subspace is used to separate the two space models. On the telephone-enrollment, telephone-test condition of the NIST SRE 2008 core task, the method reduces the equal error rate (EER) by 9.2% compared with JFA without the space transformation.
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2013, No. 8, pp. 705-710 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Keywords: Speaker Verification, Joint Factor Analysis (JFA), Signal Subspace, Space Transformation
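
The abstract does not spell out the transformation itself, so the following is only a toy illustration of what "overlap" between a speaker space and a channel space means, and of one generic way to remove it (orthogonal projection). It is not the paper's signal-subspace method. The 3-dimensional vectors and all names (`speaker_dirs`, `channel_dirs`, `overlap`, `project_out`) are hypothetical and chosen purely for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis


def overlap(vector, basis):
    """Fraction of the vector's energy lying inside span(basis), in [0, 1]."""
    inside = sum(dot(vector, b) ** 2 for b in basis)
    return inside / dot(vector, vector)


def project_out(vector, basis):
    """Remove the component of `vector` that lies inside span(basis)."""
    w = list(vector)
    for b in basis:
        c = dot(w, b)
        w = [wi - c * bi for wi, bi in zip(w, b)]
    return w


# Hypothetical toy data: one speaker direction and one channel direction
# that partly points along the speaker direction (i.e. the spaces overlap).
speaker_dirs = [[1.0, 0.0, 0.0]]
channel_dirs = [[1.0, 1.0, 0.0]]

speaker_basis = orthonormalize(speaker_dirs)
before = overlap(channel_dirs[0], speaker_basis)      # share of channel energy in speaker space
cleaned = project_out(channel_dirs[0], speaker_basis)  # channel direction with overlap removed
after = overlap(cleaned, speaker_basis)

print(before, cleaned, after)
```

In this toy case half of the channel direction's energy lies in the speaker space before the projection and none afterwards. A real JFA system works with high-dimensional GMM supervectors and estimated speaker and channel matrices, and the paper's transformation may differ substantially from a plain orthogonal projection.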



