Robust speech recognition by adopting random projection in feature space (cited by: 5)


Abstract: Noise substantially degrades speech recognition performance. To address this, the paper proposes a robust speech recognition method based on random projection (RP) in feature space and applies it to a speech recognition system for the in-car driving environment. First, the original speech feature parameters are linearly projected into a new feature space using random matrices, so that the new features preserve the distances between the original features with maximum probability while following a distribution closer to Gaussian. A hidden Markov model (HMM) is then trained for every word. At test time, the initial pattern-matching results are combined by majority voting to produce the final recognition result. Experiments on CENSREC-2, the in-car speech recognition database of the Information Processing Society of Japan, show that random-projection features greatly improve recognition performance in the driving environment.
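The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the distribution of the random matrices, the projected dimension, the number of projections, or how the votes are collected. Here the matrices are Gaussian, one decision is made per independent projection, and a nearest-template classifier stands in for the per-word HMMs. All function names are hypothetical.

```python
import random
from collections import Counter


def make_projection(in_dim, out_dim, rng):
    """Gaussian random matrix (assumed), scaled by 1/sqrt(out_dim) so that
    squared distances are preserved in expectation."""
    scale = 1.0 / out_dim ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, vector):
    """Linear map y = R x applied to one feature vector."""
    return [sum(r * x for r, x in zip(row, vector)) for row in matrix]


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def nearest_template(templates, vector):
    """Hypothetical stand-in for HMM scoring: pick the closest word template."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(templates, key=lambda word: sq_dist(templates[word], vector))


def recognize(templates, vector, matrices):
    """Classify once per random projection, then vote over the decisions."""
    votes = []
    for m in matrices:
        projected = {w: project(m, t) for w, t in templates.items()}
        votes.append(nearest_template(projected, project(m, vector)))
    return majority_vote(votes)


if __name__ == "__main__":
    rng = random.Random(0)
    matrices = [make_projection(4, 4, rng) for _ in range(5)]
    templates = {"yes": [0.0] * 4, "no": [10.0] * 4}
    print(recognize(templates, [1.0, 1.0, 1.0, 1.0], matrices))
```

In a real system the feature vectors would be frame-level speech features and each vote would come from HMM decoding in one projected feature space; the voting step itself would stay the same.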

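Authors: Zhou Azhuan (周阿转), Yu Yibiao (俞一彪)

Affiliation: Speech Technology Laboratory, Soochow University

Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 7, pp. 2070-2073, 2081 (5 pages)

Keywords: speech recognition; Random Projection (RP); majority voting; CENSREC-2

Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]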


