Abstract
Unsupervised acoustic model training is used to enlarge the training data and improve speech recognition performance under low-resource conditions. Within the standard unsupervised training framework, the traditional word-level lattice-posterior confidence is extended to an utterance-level posterior confidence for selecting hypotheses: the selected data guarantee the reliability of the whole sentence while retaining its context information, which benefits cross-word triphone acoustic modeling. A phone-coverage-based data selection method is also proposed; while requiring the hypothesized transcripts to be sufficiently confident, it preferentially selects the phone units that are rarest in the existing training samples, attacking the low-resource problem at its source and yielding more efficient data selection and further performance gains. Experiments show that the improved unsupervised training method reduces the word error rate by about 8% relative to the supervised baseline and by about 2% absolute compared with conventional unsupervised training, significantly improving system performance in low-resource scenarios.
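As a rough illustration of the two selection criteria summarized above, the Python sketch below combines an utterance-level posterior confidence filter with a phone-coverage ranking. The Hypothesis structure, the geometric-mean confidence, and the threshold/budget values are hypothetical assumptions for illustration only, not the authors' implementation.

```python
# Sketch of the two data-selection criteria described in the abstract:
# (1) utterance-level posterior confidence filtering and
# (2) phone-coverage-weighted selection.
# All data structures and parameter values here are hypothetical placeholders.

from dataclasses import dataclass
from collections import Counter
from math import exp, log
from typing import List


@dataclass
class Hypothesis:
    """One automatically transcribed utterance from the untranscribed pool."""
    utt_id: str
    word_posteriors: List[float]   # per-word lattice posterior probabilities
    phones: List[str]              # phone sequence of the hypothesized transcript


def utterance_confidence(hyp: Hypothesis) -> float:
    """Utterance-level posterior as the geometric mean of word posteriors,
    so a sentence is accepted or rejected as a whole (keeping its context)
    rather than word by word."""
    if not hyp.word_posteriors:
        return 0.0
    return exp(sum(log(max(p, 1e-12)) for p in hyp.word_posteriors)
               / len(hyp.word_posteriors))


def select_by_confidence(hyps, threshold=0.85):
    """Keep only utterances whose sentence-level confidence is high enough."""
    return [h for h in hyps if utterance_confidence(h) >= threshold]


def select_by_phone_coverage(hyps, seed_phone_counts: Counter,
                             threshold=0.7, budget=1000):
    """Among sufficiently confident utterances, prefer those dominated by
    phones that are rare in the existing (seed) training data."""
    def rarity(h: Hypothesis) -> float:
        return sum(1.0 / (1 + seed_phone_counts[p]) for p in h.phones) \
               / max(len(h.phones), 1)

    candidates = [h for h in hyps if utterance_confidence(h) >= threshold]
    candidates.sort(key=rarity, reverse=True)
    return candidates[:budget]


if __name__ == "__main__":
    seed_counts = Counter({"a": 500, "b": 480, "zh": 12, "ng": 8})
    pool = [
        Hypothesis("utt1", [0.95, 0.90, 0.92], ["a", "b", "a"]),
        Hypothesis("utt2", [0.88, 0.91], ["zh", "ng"]),
        Hypothesis("utt3", [0.40, 0.95], ["a", "zh"]),
    ]
    print([h.utt_id for h in select_by_confidence(pool)])
    print([h.utt_id for h in select_by_phone_coverage(pool, seed_counts, budget=2)])
```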
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journal (北大核心)
2013, Issue 7, pp. 1001-1004, 1010 (5 pages)
Journal of Tsinghua University (Science and Technology)
Funding
National Natural Science Foundation of China (Nos. 60931160443, 61273268, 90920302)
National Key Technology R&D Program of China (No. 2009BAH41B01)
Keywords
speech recognition
low data resource
unsupervised training
data selection