Discriminative combination of multiple local classifiers in lattice rescoring

Abstract: This paper proposes combining a hidden Markov model (HMM) based spectral acoustic model, a Gaussian mixture model (GMM) based tone classifier, and a multi-layer perceptron (MLP) based phoneme classifier to improve recognition accuracy in second-pass decoding (lattice rescoring) for speech recognition. In the combination, each model's score is scaled by a context-dependent weight, and the weights are optimized by discriminative training so that the influence of the heterogeneous models is tuned to the phonetic context. Four manually chosen context-dependent weight sets are evaluated. Continuous speech recognition experiments show that rescoring with the local classifiers significantly reduces the error rate, and that the weight set combining the current syllable type with its left context gives the largest reduction. The proposed tree-based model combination also outperforms a recognition system built on feature-space combination, in which spectral features are merged with fundamental frequency (F0) features and phoneme posterior probability features.
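The weighted score combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lattice is simplified to a short list of candidate paths, the language model score is omitted, and the names `Arc`, `combined_score`, `rescore`, the context labels, and all numeric scores and weights are hypothetical values chosen for illustration. The weights here are set by hand, whereas the paper trains them discriminatively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One syllable hypothesis on a path (hypothetical structure)."""
    left_context: str   # label of the preceding syllable type (assumed)
    syllable_type: str  # label of the current syllable type (assumed)
    hmm: float          # log score from the HMM spectral model
    tone: float         # log score from the GMM tone classifier
    phone: float        # log score from the MLP phoneme classifier


def combined_score(arc, weights, default=(1.0, 0.0, 0.0)):
    """Weighted sum of the three log scores.

    The weight triple is looked up by (left context, current syllable type);
    unseen contexts fall back to the HMM score alone.
    """
    w_hmm, w_tone, w_phone = weights.get(
        (arc.left_context, arc.syllable_type), default
    )
    return w_hmm * arc.hmm + w_tone * arc.tone + w_phone * arc.phone


def rescore(paths, weights):
    """Return the candidate path with the highest total combined score."""
    return max(paths, key=lambda p: sum(combined_score(a, weights) for a in p))


# Two competing one-arc paths with made-up scores.
path_a = [Arc("sil", "tone1", hmm=-10.0, tone=-3.0, phone=-2.0)]
path_b = [Arc("sil", "tone4", hmm=-10.5, tone=-0.5, phone=-1.0)]

# Hand-set context-dependent weights (assumed values, not trained).
weights = {
    ("sil", "tone1"): (1.0, 0.5, 0.5),
    ("sil", "tone4"): (1.0, 0.5, 0.5),
}

best_hmm_only = rescore([path_a, path_b], {})
best_combined = rescore([path_a, path_b], weights)
```

In this toy case the HMM score alone prefers `path_a`, while adding the tone and phoneme classifier scores switches the choice to `path_b`, which is the kind of correction rescoring with local classifiers is meant to produce.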

Authors: Huang Hao (黄浩), Li Binghu (李兵虎)

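Affiliation: Laboratory of Multilingual Information Technology, School of Information Science and Engineering, Xinjiang University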

Source: Computer Engineering and Applications (《计算机工程与应用》), 2011, No. 32, pp. 163-166 (4 pages). Indexed in CSCD and the Peking University Core Journals list.

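Funding: National Natural Science Foundation of China (No. 60965002); Cultivation Fund of the Scientific Research Program of Higher Education Institutions of Xinjiang (No. XJEDU2008S15); Xinjiang University Doctoral Research Start-up Fund (No. BS090143)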

Keywords: discriminative model combination; speech recognition; multi-layer perceptron; discriminative training

Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
