Abstract
Chinese speech recognition has made considerable progress. However, China's strong regional variation in speech (proverbially, the accent changes every ten li) means that Chinese recognition systems achieve low recognition rates and poor performance on dialects. To address this shortcoming, an HTK-based isolated-word recognition system for the Hengyang dialect is constructed. The system uses the HTK 3.4.1 toolkit, takes the phoneme as the basic recognition unit, extracts 39-dimensional Mel-frequency cepstral coefficient (MFCC) speech features, builds hidden Markov models (HMMs), and uses the Viterbi algorithm for model training and matching, thereby achieving isolated-word recognition of the Hengyang dialect. Comparative experiments evaluate system performance under different phoneme models and different numbers of Gaussian mixtures. The results show that combining 39-dimensional MFCCs and 5 Gaussian mixtures with the HMM greatly improves system performance.
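The matching step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' HTK setup: it assumes discrete observation symbols instead of 39-dimensional MFCC vectors with Gaussian mixtures, whole-word models instead of phoneme models, and two hypothetical toy models (`word_a`, `word_b`) whose probabilities are invented for illustration. It shows only the idea of scoring an utterance against each HMM with the Viterbi algorithm and choosing the best-scoring word.

```python
import math

NEG_INF = float("-inf")


def log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi_log_score(obs, start, trans, emit):
    """Log-probability of the single best state path that emits obs."""
    n = len(start)
    # delta[j]: best log score of any path that ends in state j
    delta = [log(start[j]) + log(emit[j][obs[0]]) for j in range(n)]
    for o in obs[1:]:
        delta = [
            max(delta[i] + log(trans[i][j]) for i in range(n)) + log(emit[j][o])
            for j in range(n)
        ]
    return max(delta)


def recognize(obs, models):
    """Return the word whose HMM gives obs the highest Viterbi score."""
    return max(models, key=lambda w: viterbi_log_score(obs, *models[w]))


# Hypothetical 2-state left-to-right models over symbols {0, 1}:
# (start probabilities, transition matrix, emission matrix)
MODELS = {
    "word_a": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "word_b": ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
    print(recognize([1, 1, 0, 0], MODELS))
```

Working in the log domain avoids numerical underflow on long observation sequences, which is why the sketch sums log-probabilities rather than multiplying probabilities.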
Source
Computer Systems & Applications (《计算机系统应用》)
2017, No. 5, pp. 247-252 (6 pages)
Keywords
Hidden Markov Model Toolkit (HTK)
Hidden Markov Model (HMM)
Hengyang dialect
Mel Frequency Cepstral Coefficients (MFCC)
Viterbi algorithm