
Speech recognition adaptive clustering feature extraction algorithms based on the k-means algorithm and the normalized intra-class variance

Cited by: 6
Abstract: The inter-frame independence assumption in speech recognition models simplifies computation, but it inevitably reduces model accuracy and increases recognition errors. The objective of this paper is to find a feature that both satisfies the inter-frame independence assumption and preserves as much of the original speech information as possible. Two adaptive clustering feature extraction algorithms for speech recognition are proposed, based on the k-means algorithm and on the normalized intra-class variance, which adaptively extract a clustered feature stream. The adaptive features were compared against baseline systems on three speech recognition models: a Gaussian mixture model-hidden Markov model (GMM-HMM), a duration distribution-based HMM (DDBHMM), and a context-dependent deep neural network HMM (CD-DNN-HMM). The results show that the adaptive feature based on the normalized intra-class variance reduces the recognition error rates of the three models by 10.53%, 5.17%, and 2.65% relative to the original features, demonstrating the good performance of the adaptive clustering speech features.
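The record does not include the paper's implementation details. As a rough illustration of the two building blocks the abstract names, the sketch below shows generic k-means clustering over speech feature frames and a normalized intra-class variance criterion (within-cluster scatter over total scatter). The function names, the farthest-point initialization, and the synthetic data are this sketch's own assumptions, not the paper's method.

```python
import numpy as np

def kmeans(frames, k, iters=20):
    """Plain k-means over feature frames (illustrative sketch only,
    not the paper's adaptive clustering algorithm)."""
    # Deterministic farthest-point initialization spreads the seeds out.
    centers = [frames[0]]
    for _ in range(k - 1):
        dists = np.min([np.linalg.norm(frames - c, axis=1) for c in centers], axis=0)
        centers.append(frames[dists.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(iters):
        # Assign each frame to its nearest center.
        d = np.linalg.norm(frames[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster goes empty.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = frames[labels == j].mean(axis=0)
    return labels, centers

def normalized_intra_class_variance(frames, labels):
    """Within-cluster scatter divided by total scatter.
    Close to 0 for tight, well-separated clusters; approaches 1 as they overlap."""
    total = ((frames - frames.mean(axis=0)) ** 2).sum()
    intra = sum(((frames[labels == j] - frames[labels == j].mean(axis=0)) ** 2).sum()
                for j in np.unique(labels))
    return intra / total
```

On two well-separated groups of frames, the criterion is near 0, which is why a lower normalized intra-class variance can serve as a score for choosing how frames are grouped into a clustered feature stream.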
Authors: XIAO Xi (肖熙), ZHOU Lu (周路)
Source: Journal of Tsinghua University (Science and Technology) (EI, CAS, CSCD, Peking University Core), 2017, No. 8, pp. 857-861 (5 pages)
Funding: National Natural Science Foundation of China General Program (61374120)
Keywords: feature extraction; adaptive clustering feature; inter-frame independence assumption; normalized intra-class variance