Abstract
Acoustic model adaptation aims to mitigate the recognition performance degradation caused by the mismatch between training and testing data. Among the adaptation techniques built on deep neural network (DNN) acoustic models, retraining is the most straightforward, but it is prone to over-fitting, especially when adaptation data are sparse. In this paper, two typical acoustic model adaptation methods, namely linear transformation network adaptation and Kullback-Leibler (KL) divergence regularization adaptation, are experimentally explored for domain-specific automatic speech recognition tasks, and a detailed comparison of system performance is made. The results show that the KL divergence regularization technique achieves better performance under different amounts of adaptation data.
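As a rough illustration of the KL divergence regularization idea mentioned in the abstract, the sketch below computes an adaptation loss whose targets are interpolated between the one-hot labels and the posteriors of the unadapted (speaker/task-independent) model; minimizing it is equivalent, up to a constant, to cross-entropy plus a KL term that keeps the adapted DNN close to the original one, which limits over-fitting on sparse adaptation data. This is a minimal sketch assuming a PyTorch-style setup; the names kld_adaptation_loss, si_logits, and rho are illustrative and are not taken from the paper.

    # Hedged sketch of KL-divergence-regularized DNN adaptation (illustrative only).
    import torch
    import torch.nn.functional as F

    def kld_adaptation_loss(adapted_logits, si_logits, targets, rho=0.5):
        # Cross-entropy against targets interpolated between the one-hot labels
        # and the unadapted (SI) model's posteriors; rho controls how strongly
        # the adapted model is pulled back toward the SI model.
        num_classes = adapted_logits.size(-1)
        one_hot = F.one_hot(targets, num_classes).float()
        with torch.no_grad():
            si_posteriors = F.softmax(si_logits, dim=-1)
        soft_targets = (1.0 - rho) * one_hot + rho * si_posteriors
        log_probs = F.log_softmax(adapted_logits, dim=-1)
        return -(soft_targets * log_probs).sum(dim=-1).mean()

With rho = 0 this reduces to ordinary retraining on the adaptation data; larger rho values regularize more heavily, which is typically helpful when the amount of adaptation data is small.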
Source
Journal of Tianjin University (Science and Technology)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2015, No. 9, pp. 765-770 (6 pages)
Funding
National High Technology Research and Development Program of China (863 Program) (No. 2012AA012503)
Strategic Priority Research Program of the Chinese Academy of Sciences (Nos. XDA06030100, XDA06030500)
National Natural Science Foundation of China (Nos. 11461141004, 91120001, 61271426)
Key Deployment Project of the Chinese Academy of Sciences (No. KYGD-EW-103-2)
Keywords
acoustic model adaptation
speech recognition
deep neural network (DNN)