Abstract
To address coarticulation in continuous speech, two single-stream Dynamic Bayesian Network (DBN) models based on context-dependent triphones are proposed: the SS-DBN-TRI model, which uses word-internal (intra-word) triphone expansion, and the SS-DBN-TRI-CON model, which uses cross-word (inter-word) triphone expansion. SS-DBN-TRI is an improvement on the single-stream DBN (SS-DBN) model proposed by Bilmes: word-internal context-dependent triphone nodes replace the monophone nodes, each word is composed of its corresponding triphone units, and the triphone units are associated with the observation vectors. SS-DBN-TRI-CON is also based on the SS-DBN model: a previous-phone node and a next-phone node are added for the current phone, forming a new cross-word triphone variable node. This new triphone node is associated with the observation vectors, and the association is described by a Gaussian Mixture Model. Experiments on a continuous digit speech database show that SS-DBN-TRI-CON achieves the best speech recognition performance.
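The difference between the two context expansions named in the abstract can be illustrated with a minimal sketch. It only shows how triphone labels differ when context stops at word boundaries versus when it crosses them; it does not implement either DBN model. The toy lexicon, the `left-phone+right` label notation, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicon (not from the paper): word -> phone sequence.
LEXICON: Dict[str, List[str]] = {
    "one": ["w", "ah", "n"],
    "two": ["t", "uw"],
}


def triphone(left: Optional[str], phone: str, right: Optional[str]) -> str:
    """Build a context-dependent label such as 'w-ah+n' (assumed notation)."""
    name = phone
    if left is not None:
        name = f"{left}-{name}"
    if right is not None:
        name = f"{name}+{right}"
    return name


def expand(phones: List[str]) -> List[str]:
    """Attach left and right neighbours to every phone in one sequence."""
    units = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        units.append(triphone(left, phone, right))
    return units


def word_internal(words: List[str]) -> List[str]:
    """Context is limited to each word: no context across word boundaries."""
    units: List[str] = []
    for word in words:
        units.extend(expand(LEXICON[word]))
    return units


def cross_word(words: List[str]) -> List[str]:
    """Context spans word boundaries: the utterance is one phone sequence."""
    phones = [p for word in words for p in LEXICON[word]]
    return expand(phones)


if __name__ == "__main__":
    utterance = ["one", "two"]
    print(word_internal(utterance))
    print(cross_word(utterance))
```

In this sketch the two expansions differ only at the word boundary: the word-internal version leaves the last phone of "one" and the first phone of "two" without cross-boundary context, while the cross-word version gives both a neighbour from the adjacent word.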
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2007, Issue 35, pp. 35-38 (4 pages)
Computer Engineering and Applications
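Funding
Science and technology cooperation project between the Ministry of Science and Technology of China and the Flemish Region of Belgium (No. [2004]487)
Northwestern Polytechnical University Talent Training Program (No. 04XD0102).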
Keywords
Dynamic Bayesian Network (DBN)
speech recognition
triphone
monophone
context-dependent