Abstract
This work improves the Gaussian mixture model (GMM) and the N-best recognition algorithm used for speech keyword spotting over multi-candidate (N-best) Chinese syllable lattices. First, different simplification strategies for the GMM were examined and evaluated experimentally, which showed that full covariance matrices give the best recognition performance. Then, the classic N-best token passing algorithm was modified to exploit pronunciation characteristics specific to the Chinese language. Experiments show that these two changes improve the 1-best results of a speech recognition engine that outputs syllables and substantially improve its N-best performance. Finally, a keyword spotting system based on an N-best lattice was built and used to verify the effect of these improvements.
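The contrast between a full covariance matrix and a simplified (diagonal) one can be illustrated with a small sketch. This is not the paper's implementation: the function names, the 2-D feature vectors, and the covariance values are all assumptions chosen for illustration. It only shows why a diagonal simplification loses information when feature dimensions are correlated.

```python
import math

def log_gauss_full_2d(x, mean, cov):
    """Log density of a 2-D Gaussian with a full covariance matrix."""
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    a, b, c, d = cov[0][0], cov[0][1], cov[1][0], cov[1][1]
    det = a * d - b * c
    # Inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    maha = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + maha)

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with a diagonal covariance matrix."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(weights, component_log_densities):
    """Log likelihood of a mixture, using log-sum-exp for stability."""
    terms = [math.log(w) + ld for w, ld in zip(weights, component_log_densities)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical component with strongly correlated dimensions.
mean = (0.0, 0.0)
full_cov = [[1.0, 0.8], [0.8, 1.0]]
diag_var = (1.0, 1.0)  # diagonal simplification: off-diagonal terms dropped

along = (1.0, 1.0)     # point lying along the correlation direction
against = (1.0, -1.0)  # point lying against it

for point in (along, against):
    full = log_gauss_full_2d(point, mean, full_cov)
    diag = log_gauss_diag(point, mean, diag_var)
    print(point, "full:", round(full, 3), "diag:", round(diag, 3))
```

The diagonal model assigns both points the same score, while the full covariance model separates them. The trade-off is cost: a full covariance matrix needs on the order of d² parameters and operations per component instead of d, which is one reason simplified structures are considered at all.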
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2015, Issue 5, pp. 508-513 (6 pages)
Journal of Tsinghua University(Science and Technology)
Keywords
speech keyword spotting
multi-candidate lattice
Gaussian mixture model
compute unified device architecture(CUDA)
triphone model