Abstract
This paper proposes a continuous-speech word segmentation algorithm for keyword spotting. The core of the algorithm is a set of multivariate Gaussian distributions whose five parameters are derived from acoustic features of the speech signal: the short-time log energy, the short-time zero-crossing rate, the short-time autocorrelation coefficient, the first predictor coefficient of a linear prediction (LP) analysis, and the normalized energy of the LP prediction error. Test results show that word segmentation based on Uyghur phoneme acoustic features achieves better segmentation and retrieval efficiency than equal-width word segmentation. Finally, some ideas for optimizing the algorithm are discussed.
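The five features named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`frame_features`, `levinson_durbin`, `classify`), the LP order of 12, the definition of the normalized error as the residual energy divided by the frame energy, and the class labels are all assumptions made for illustration. Each frame is mapped to a five-dimensional vector and scored against one multivariate Gaussian per class.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation r (A(z) = 1 + sum a_j z^-j)."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:          # degenerate frame (e.g. all zeros): stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err        # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)  # residual energy after order i
    return a, err

def frame_features(frame, order=12):
    """Five per-frame features; exact definitions are assumptions, not the paper's."""
    x = np.asarray(frame, dtype=float)
    n = len(x)
    eps = 1e-10
    log_energy = 10.0 * np.log10(eps + np.sum(x ** 2) / n)
    signs = np.where(x >= 0, 1, -1)
    zcr = np.sum(signs[1:] != signs[:-1]) / (n - 1)   # fraction of sign changes
    c1 = np.sum(x[1:] * x[:-1]) / (
        np.sqrt(np.sum(x[1:] ** 2) * np.sum(x[:-1] ** 2)) + eps)  # lag-1 autocorrelation
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a, err = levinson_durbin(r, order)
    alpha1 = -a[1]                                    # first predictor coefficient
    norm_err = max(err, 0.0) / (r[0] + eps)           # residual / frame energy
    return np.array([log_energy, zcr, c1, alpha1, norm_err])

def gaussian_log_likelihood(x, mean, cov):
    """Log density of a multivariate Gaussian at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (maha + logdet + len(x) * np.log(2.0 * np.pi))

def classify(x, models):
    """Pick the class whose Gaussian (mean, cov) gives the highest log-likelihood."""
    return max(models, key=lambda name: gaussian_log_likelihood(x, *models[name]))
```

In practice the class means and covariances would be estimated from labelled training frames; segment boundaries could then be placed where the per-frame class changes, although the abstract does not say how boundaries are actually derived.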
Source
Journal of Northwest Normal University (Natural Science) (《西北师范大学学报(自然科学版)》)
CAS
PKU Core (北大核心)
2013, No. 4, pp. 34-37 (4 pages)
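Funding
Xinjiang Key Laboratory of Multilingual Information Technology funded project (049807)
Scientific Research Program of Higher Education Institutions of the Xinjiang Uyghur Autonomous Region (XJEDU2012S46)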
Keywords
speech recognition
keyword spotting
acoustic feature
voiced speech
unvoiced speech