Linearized distortion model for robust speech recognition in noisy environments

Original title: 噪声环境下畸变模型线性化处理的顽健语音识别方法

Abstract: This paper addresses the robustness of speech recognition in noisy environments. The distortion model in the Mel-frequency cepstral coefficient (MFCC) domain is highly nonlinear and difficult to handle. A new linear distortion model is proposed by replacing the logarithm function with a piecewise linear interpolation function. On this basis, methods for noise parameter estimation and acoustic model compensation are derived. Because the method does not rely on a vector Taylor series (VTS) expansion as an approximation, it avoids the model error that such linearization introduces and significantly improves the robustness of the recognizer in noisy environments.
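The central idea in the abstract, replacing the logarithm with a piecewise linear interpolant, can be illustrated with a short sketch. The abstract does not give the knot placement, the number of segments, or the exact form of the distortion model, so everything below is an assumption made for illustration: the geometric knot grid, the helper names `make_log_interpolant` and `pw_log`, and the simplified additive-noise example are hypothetical and are not the authors' formulation.

```python
import numpy as np

def make_log_interpolant(knots):
    """Slopes and intercepts of the piecewise linear interpolant of log on `knots`.

    Hypothetical helper: the paper's actual knots are not given in the abstract.
    """
    knots = np.asarray(knots, dtype=float)
    log_k = np.log(knots)
    slopes = np.diff(log_k) / np.diff(knots)        # a_k for segment k
    intercepts = log_k[:-1] - slopes * knots[:-1]   # b_k for segment k
    return knots, slopes, intercepts

def pw_log(x, knots, slopes, intercepts):
    """Evaluate the interpolant: log(x) ~= a_k * x + b_k on segment k."""
    x = np.asarray(x, dtype=float)
    # Index of the segment containing x, clipped to the valid range.
    k = np.clip(np.searchsorted(knots, x, side="right") - 1, 0, len(slopes) - 1)
    return slopes[k] * x + intercepts[k]

# Assumed grid: 61 geometrically spaced knots covering 1e-3 .. 1e3.
knots, a, b = make_log_interpolant(np.geomspace(1e-3, 1e3, 61))

# Worst-case approximation error over a test range inside the grid.
x = np.linspace(0.01, 100.0, 1000)
max_err = np.max(np.abs(pw_log(x, knots, a, b) - np.log(x)))

# Simplified additive-noise example (an assumption, not taken from the abstract):
# clean energy s plus noise energy n in the linear spectral domain.
s, n = 4.0, 1.5
y_exact = np.log(s + n)                       # nonlinear in s and n
y_linear = float(pw_log(s + n, knots, a, b))  # a_k * (s + n) + b_k within one segment
```

Within a single segment the approximated log energy is an affine function of the clean and noise energies, which is the property that makes a linear treatment possible without a Taylor expansion around a fixed point. With geometrically spaced knots the worst-case error is the same in every segment, so accuracy is controlled by the ratio between adjacent knots.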

Authors: HE Yongjun, HAN Jiqing

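Affiliations: School of Computer Science and Technology, Harbin Institute of Technology; School of Computer Science and Technology, Harbin University of Science and Technology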

Source: Journal on Communications (《通信学报》), 2010, No. 9, pp. 8-14 (7 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

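Funding: National High Technology Research and Development Program of China ("863" Program), grant 2006AA010103; National Basic Research Program of China ("973" Program), grant 2007CB311100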

Keywords: speech recognition; robustness; distortion model; linearization

Classification number: TN912 [Electronics and Telecommunications: Communication and Information Systems]
