Language Identification Based on Deep Neural Network (cited by: 7)
Abstract Existing utterance representations for automatic spoken language identification (LID) yield low accuracy on confusable languages and on short-duration utterances. To handle utterances of different durations and dialects, an improved utterance representation method is proposed based on different layers of a deep neural network (DNN). A deep bottleneck network (DBN), a DNN with an internal bottleneck layer, is employed as the front-end feature extractor. The outputs of its output layer and its middle bottleneck layer are used to derive different utterance representations, which are then fused for LID. Evaluations on the NIST LRE 2009 dataset and the NIST LRE 2011 Arabic dialect dataset demonstrate the effectiveness of the proposed method.
Source Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list, 2015, No. 12, pp. 1093-1099 (7 pages)
Funding Supported by the National Natural Science Foundation of China (No. 61172158)
Keywords Language Identification, Deep Neural Network, Utterance Representation, Deep Bottleneck Feature
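
The abstract describes taking both the bottleneck-layer output and the output-layer output of one network and turning them into utterance-level representations. The following is only a minimal sketch of that idea, not the authors' implementation. The layer sizes, the random (untrained) weights, the linear bottleneck, mean pooling over frames, and fusion by concatenation are all assumptions made for illustration; this record does not state how the paper pools frames or fuses representations.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer (untrained, illustrative)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(layer, x):
    """Compute W x + b for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, frame):
    """Return (bottleneck activations, output posteriors) for one acoustic frame."""
    h1 = sigmoid(affine(layers[0], frame))
    bottleneck = affine(layers[1], h1)  # linear bottleneck: an assumption
    h2 = sigmoid(affine(layers[2], bottleneck))
    posteriors = softmax(affine(layers[3], h2))
    return bottleneck, posteriors


def mean_pool(vectors):
    """Average a list of equal-length vectors dimension by dimension."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def utterance_representations(layers, frames):
    """Pool frame-level outputs of both layers into utterance-level vectors."""
    pairs = [forward(layers, f) for f in frames]
    bn_mean = mean_pool([p[0] for p in pairs])
    post_mean = mean_pool([p[1] for p in pairs])
    fused = bn_mean + post_mean  # fusion by concatenation: an assumption
    return bn_mean, post_mean, fused


# Hypothetical sizes: 13-dim input frames, 8-dim bottleneck, 10 output classes.
INPUT_DIM, HIDDEN_DIM, BOTTLENECK_DIM, OUTPUT_DIM = 13, 32, 8, 10
rng = random.Random(0)
layers = [
    make_layer(INPUT_DIM, HIDDEN_DIM, rng),
    make_layer(HIDDEN_DIM, BOTTLENECK_DIM, rng),
    make_layer(BOTTLENECK_DIM, HIDDEN_DIM, rng),
    make_layer(HIDDEN_DIM, OUTPUT_DIM, rng),
]
# Synthetic stand-in for one utterance of 50 acoustic frames.
frames = [[rng.gauss(0.0, 1.0) for _ in range(INPUT_DIM)] for _ in range(50)]
bn_repr, post_repr, fused = utterance_representations(layers, frames)
print(len(bn_repr), len(post_repr), len(fused))  # prints "8 10 18"
```

In a real system the network would be trained first and the pooled vectors would feed a back-end classifier; the sketch only shows where the two kinds of representation come from in a single forward pass.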