Robust speech recognition based on dynamic one-sided autocorrelation sequence and frequency warped linear predictive coding

Original title: 基于动态单边自相关序列和频率规整线性预测的抗噪声语音识别 (cited by: 5)

Abstract: This paper proposes a speech feature analysis method that both matches the characteristics of human hearing and is robust to noise. The short-time one-sided autocorrelation sequence (OSAS) of speech is first filtered along the time direction, which smooths its temporal trajectories and suppresses the effect of noise; the filtered sequences are called dynamic autocorrelation sequences (DAS). Frequency-warped LPC (WLPC) analysis is then applied to the DAS in place of the original signal, and a cepstral transform of the result yields the feature parameters, denoted DAS-WLPCC. Chinese digit recognition experiments based on continuous-density HMMs show that DAS-WLPCC features are effective in the presence of white and colored noise, and that a recognizer using them outperforms recognizers using traditional features such as Mel cepstral coefficients and LPC cepstral coefficients.
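The three stages named in the abstract (OSAS, temporal filtering, warped LPC followed by a cepstral transform) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the temporal filter, so a 3-frame moving average is assumed here, and all function names, the frame sizes, the model order, and the warping factor `lam=0.4` are hypothetical choices made for the example. The warped autocorrelation uses the common construction of correlating the input with the outputs of a chain of first-order all-pass sections.

```python
import numpy as np

def one_sided_autocorr(frame, max_lag):
    """Biased autocorrelation of one frame at non-negative lags 0..max_lag."""
    n = len(frame)
    return np.array([np.dot(frame[:n - k], frame[k:]) / n
                     for k in range(max_lag + 1)])

def smooth_trajectories(osas, width=3):
    """Moving average along the frame axis (ASSUMED filter; width must be odd)."""
    kernel = np.ones(width) / width
    half = width // 2
    padded = np.pad(osas, ((half, half), (0, 0)), mode="edge")
    return np.stack([np.convolve(padded[:, k], kernel, mode="valid")
                     for k in range(osas.shape[1])], axis=1)

def allpass(x, lam):
    """First-order all-pass D(z) = (z^-1 - lam) / (1 - lam * z^-1)."""
    y = np.zeros_like(x)
    prev_x = prev_y = 0.0
    for n, xn in enumerate(x):
        y[n] = -lam * xn + prev_x + lam * prev_y
        prev_x, prev_y = xn, y[n]
    return y

def warped_autocorr(x, order, lam):
    """r[k] = <x, D^k x>; with lam = 0 this is the ordinary autocorrelation."""
    # Zero-pad so the tails of the all-pass responses are not cut off.
    x0 = np.concatenate([np.asarray(x, dtype=float), np.zeros(len(x))])
    xk = x0.copy()
    r = np.empty(order + 1)
    for k in range(order + 1):
        r[k] = np.dot(x0, xk)
        xk = allpass(xk, lam)
    return r

def levinson(r, order):
    """Levinson-Durbin recursion; returns predictor a[1..order] and error."""
    a = np.zeros(order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err
        new_a = a.copy()
        new_a[i] = k
        new_a[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of the all-pole model 1 / (1 - sum_k a[k] z^-k)."""
    p = len(a)
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def das_wlpcc(signal, frame_len=256, hop=128, n_lags=64,
              order=12, n_ceps=12, lam=0.4):
    """OSAS per frame -> temporal smoothing (DAS) -> warped LPC -> cepstrum."""
    window = np.hamming(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    osas = np.array([one_sided_autocorr(f, n_lags - 1) for f in frames])
    das = smooth_trajectories(osas)
    feats = []
    for seq in das:  # each DAS row is analysed in place of the waveform
        a, _ = levinson(warped_autocorr(seq, order, lam), order)
        feats.append(lpc_to_cepstrum(a, n_ceps))
    return np.array(feats)
```

Calling `das_wlpcc(x)` on a one-dimensional NumPy array `x` returns one row of cepstral coefficients per frame. Swapping `smooth_trajectories` for a different temporal filter changes only the DAS stage and leaves the rest of the sketch untouched.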

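Authors: 刘海滨 (LIU Haibin), 吴镇扬 (WU Zhenyang), 赵力 (ZHAO Li), 曾毓敏 (ZENG Yumin)

Affiliations: Department of Radio Engineering, Southeast University; Department of Physics, Nanjing Normal University

Source: Acta Acustica (《声学学报》), 2004, No. 2, pp. 182-186 (5 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (grants 69871009 and 60272044)

Keywords: dynamic one-sided autocorrelation sequence; frequency-warped linear prediction; noise-robust speech recognition; speech feature analysis; autocorrelation function; cepstral transform; speech recognition system; Acoustic noise; Audition; Robustness (control systems); Speech analysis; Speech coding

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]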

References (10)

1. Sanches I. Noise-compensated hidden Markov models. IEEE Trans. on Speech and Audio Processing, 2000; 8(5): 533-540.
2. Hwang T H, Lee L M, Wang H C. Cepstral behavior due to additive noise and a compensation scheme for noisy speech recognition. IEE Proc. Vis. Image Signal Process., 1998; 145(5): 316-321.
3. Mansour D, Juang B H. The short-time modified coherence representation and its application for noisy speech recognition. IEEE Trans. Acoust., Speech, Signal Processing, 1989; 37(6): 795-804.
4. Hernando J, Nadeu C. Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition. IEEE Trans. on Speech and Audio Processing, 1997; 5(1): 80-84.
5. Davis S B, Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust., Speech, Signal Processing, 1980; 28(4): 357-366.
6. Kim Y, Smith J O. A speech feature based on Bark frequency warping: the non-uniform linear prediction cepstrum. Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New York, 1999(10): 17-20.
7. Rabiner L. Fundamentals of speech recognition. Prentice Hall, 1993.
8. Smith J O, Abel J S. Bark and ERB bilinear transform. IEEE Trans. on Speech and Audio Processing, 1999; 7(6): 697-708.
9. Härmä A, Laine U K. A comparison of warped and conventional linear predictive coding. IEEE Trans. on Speech and Audio Processing, 2001; 9(5): 579-588.
10. Yang Xingjun, Chi Huisheng. Digital Processing of Speech Signals (语音信号数字处理) [M]. Beijing: Publishing House of Electronics Industry, 1999.
