Review of robust speech recognition (鲁棒语音识别技术综述)

Cited by: 4

Abstract: Robust speech recognition addresses the mismatch between training and recognition conditions that a noisy environment introduces into an automatic speech recognition (ASR) system. Based on how noise affects an ASR system, this paper classifies and summarizes robust speech recognition techniques at three levels: speech enhancement in the signal space, feature enhancement in the feature space, and model compensation and enhancement in the model space. It also analyzes the characteristics, implementation, and applications of the different methods.

Authors: Lü Zhao (吕钊), Wu Xiaopei (吴小培), Zhang Chao (张超)

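Affiliation: Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University (安徽大学计算智能与信号处理教育部重点实验室)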

Source: Journal of Anhui University (Natural Science Edition) (《安徽大学学报（自然科学版）》), indexed in CAS and the Peking University Core Journals list, 2013, No. 5, pp. 17-24 (8 pages)

Funding: National Natural Science Foundation of China (61271352); Anhui University Academic and Technical Leader Recruitment Project (02303203)

Keywords: robust speech recognition; speech enhancement; feature enhancement; model compensation/enhancement

Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

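Citation network: 22 references; 43 second-level references; 8 co-citing documents; 66 co-cited documents; 4 citing documents; 12 second-level citing documents.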
