Abstract
To improve the recognition rate of continuous speech recognition systems, a Bottleneck feature extraction method based on the deep belief network is proposed. Stacked restricted Boltzmann machines are pre-trained without supervision using the contrastive divergence algorithm to obtain the initial network parameters. The back-propagation algorithm is then used to fine-tune the network parameters iteratively, with frame-level cross-entropy as the training criterion. Context-dependent triphone models are adopted, and the phone error rate is used to evaluate system performance. Experimental results show that the Bottleneck features extracted with the deep belief network outperform traditional features.
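The pre-training step named in the abstract (contrastive divergence applied to stacked restricted Boltzmann machines) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes binary (Bernoulli) units throughout, one-step contrastive divergence (CD-1) with per-sample updates, and a toy network. All function names, layer sizes, and hyperparameters are hypothetical, and the supervised back-propagation fine-tuning stage is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def up(v, W, c):
    """p(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def down(h, W, b):
    """p(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0, W, b, c, lr, rng):
    """Apply one CD-1 update in place for a single training vector v0."""
    ph0 = up(v0, W, c)                                    # positive phase
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]  # sample hidden states
    pv1 = down(h0, W, b)                                  # reconstruction
    ph1 = up(pv1, W, c)                                   # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
        b[i] += lr * (v0[i] - pv1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


def train_rbm(data, n_hidden, epochs=5, lr=0.1, seed=0):
    """Train one RBM with CD-1 and return (W, b, c)."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    for _ in range(epochs):
        for v0 in data:
            cd1_update(v0, W, b, c, lr, rng)
    return W, b, c


def pretrain_stack(data, layer_sizes, **kwargs):
    """Greedy layer-wise pre-training: each RBM is trained on the hidden
    probabilities produced by the RBM below it."""
    layers, inputs = [], data
    for n_hidden in layer_sizes:
        W, b, c = train_rbm(inputs, n_hidden, **kwargs)
        layers.append((W, c))  # keep only what the forward pass needs
        inputs = [up(v, W, c) for v in inputs]
    return layers


def bottleneck_features(v, layers, bottleneck_index):
    """Forward-propagate v and return the activations of the chosen layer."""
    for W, c in layers[:bottleneck_index + 1]:
        v = up(v, W, c)
    return v
```

For example, `pretrain_stack(data, [8, 3, 8])` followed by `bottleneck_features(x, layers, 1)` returns the three activations of the narrow middle layer. In a full system these activations would be read out only after fine-tuning, and real-valued acoustic input would usually call for Gaussian visible units in the first layer.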
Source
Journal of Guilin University of Electronic Technology (《桂林电子科技大学学报》)
2016, No. 2, pp. 118-122 (5 pages)
Funding
Guangxi Natural Science Foundation (2012GXNSFAA053221)
Guangxi Hundred-Billion-Yuan Industries Industry-University-Research-Application Cooperation Project (信科院0618)