
Weighted phone log-likelihood ratio feature for spoken language recognition

Cited by: 4
Abstract: Extracting language-discriminative information from the speech signal is one of the key problems in spoken language recognition (SLR). The frame-level phone log-likelihood ratio (PLLR) feature was recently introduced to the field and has shown excellent performance. This paper uses the F-ratio method to analyze how much language-discriminative information each dimension of the PLLR feature vector carries, and proposes a weighted phone log-likelihood ratio (WPLLR) feature that assigns higher weights to the PLLR dimensions with more language-discriminative information (higher F-ratio values). Experiments on the National Institute of Standards and Technology (NIST) 2007 language recognition evaluation test set show that, compared with the original PLLR feature, the proposed WPLLR feature significantly reduces both the average detection cost and the equal error rate.
Source: Journal of Tsinghua University (Science and Technology), 2017, No. 10, pp. 1038-1041, 1047 (5 pages). Indexed in: EI, CAS, CSCD, PKU Core.
Funding: National Natural Science Foundation of China (11461141004, 91120001, 61271426); National "863" High-Tech Program (2012AA012503); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-103-2)
Keywords: speech signal processing; spoken language recognition; linguistic discrimination; weighted phone log-likelihood ratio (WPLLR); F-ratio
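The abstract describes the pipeline only in outline, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes the usual PLLR definition (the logit of each phone posterior) and a common form of the F-ratio (variance of the per-language means divided by the mean within-language variance). The weighting rule in `wpllr_weights` (F-ratios rescaled to average 1) and all function names are hypothetical; the record does not state the paper's actual weighting function.

```python
import math
from statistics import mean, pvariance


def pllr(posteriors, eps=1e-6):
    """Map one frame of phone posteriors to log-likelihood ratios (logits)."""
    out = []
    for p in posteriors:
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0) and division by zero
        out.append(math.log(p / (1.0 - p)))
    return out


def f_ratio(frames_by_language):
    """Per-dimension F-ratio over a dict {language: list of feature frames}.

    Numerator: variance of the per-language means.
    Denominator: mean of the within-language variances.
    """
    groups = list(frames_by_language.values())
    dims = len(groups[0][0])
    ratios = []
    for d in range(dims):
        columns = [[frame[d] for frame in frames] for frames in groups]
        between = pvariance([mean(col) for col in columns])
        within = mean(pvariance(col) for col in columns)
        ratios.append(between / within if within > 0 else 0.0)
    return ratios


def wpllr_weights(ratios):
    """Hypothetical weighting rule (an assumption): rescale F-ratios to average 1."""
    avg = mean(ratios)
    return [r / avg for r in ratios] if avg > 0 else [1.0] * len(ratios)


def apply_weights(frame, weights):
    """Scale each PLLR dimension by its weight to obtain a WPLLR-style frame."""
    return [w * x for w, x in zip(weights, frame)]
```

A usage sketch under the same assumptions: convert each frame of phone posteriors with `pllr`, group the resulting frames by language label, compute `f_ratio` on the training set, derive weights once with `wpllr_weights`, and then call `apply_weights` on every training and test frame before the back-end classifier. Estimating the weights on training data only keeps the test set unseen.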
