
A New DBN-Based Acoustic Feature Extraction Method

A New Feature Extraction Method Based on Bottleneck Deep Belief Network
Abstract  In large-vocabulary continuous speech recognition (LVCSR) systems, to further strengthen network robustness and improve the recognition accuracy of the deep belief network, a feature extraction method based on a discriminatively trained, ODLR-adapted bottleneck deep belief network is proposed. The method first uses a bottleneck deep belief network, chosen for its robustness, to perform preliminary feature extraction; discriminative training is then applied so that the network becomes more discriminative and achieves higher recognition accuracy; on this basis, speaker adaptation is introduced to adjust the network and further improve the robustness of the model. The proposed acoustic features were tested on several public continuous-speech databases with strong noise and casual speaking styles; the recognition results improved by 22.2%, with a relative improvement of 6.9% in recognition accuracy. The experimental results demonstrate the effectiveness of the proposed feature extraction method compared with the conventional approach.
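As a rough illustration of the bottleneck idea summarized in the abstract, the sketch below shows how frame-level features can be read off a narrow hidden layer of a feed-forward network. It is a minimal sketch under stated assumptions only: the class name, sigmoid units, layer sizes, and the 42-dimensional bottleneck are illustrative choices rather than the paper's configuration, and the DBN pre-training, discriminative training, and ODLR speaker adaptation steps of the proposed method are not implemented here.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class BottleneckFeatureExtractor:
        """Feed-forward network with one narrow ('bottleneck') hidden layer.

        The activations of the bottleneck layer are taken as the acoustic
        feature vector for each input frame. All sizes are illustrative.
        """

        def __init__(self, layer_sizes, bottleneck_index, seed=0):
            rng = np.random.default_rng(seed)
            # Random weights stand in for a trained network; in practice the
            # weights would come from pre-training and fine-tuning.
            self.weights = [rng.normal(0.0, 0.1, size=(m, n))
                            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
            self.biases = [np.zeros(n) for n in layer_sizes[1:]]
            self.bottleneck_index = bottleneck_index  # index into self.weights

        def extract(self, frames):
            """Propagate frames forward and return bottleneck activations."""
            h = frames
            for i, (w, b) in enumerate(zip(self.weights, self.biases)):
                h = sigmoid(h @ w + b)
                if i == self.bottleneck_index:
                    return h  # stop once the bottleneck layer is reached
            return h

    if __name__ == "__main__":
        # 39-dimensional input frames (e.g. MFCC-like), 42-unit bottleneck.
        extractor = BottleneckFeatureExtractor(
            layer_sizes=[39, 1024, 1024, 42, 1024, 2000],
            bottleneck_index=2,   # weights[2] maps into the 42-unit layer
        )
        frames = np.random.randn(100, 39)   # 100 frames of dummy input
        features = extractor.extract(frames)
        print(features.shape)               # (100, 42)

In a typical bottleneck-feature pipeline, such a network would first be trained on a phonetic classification target, and the bottleneck activations would then replace or augment the original spectral features fed to the recognizer's acoustic model.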
Source  Radio Communications Technology (《无线电通信技术》), 2015, No. 6, pp. 41-45 (5 pages)
Funding  National Natural Science Foundation of China (Grant No. 60872113)
Keywords  Continuous Speech Recognition; Bottleneck Deep Belief Network; Discriminative Training; ODLR
