Acoustic features based on auditory model and adaptive fractional Fourier transform for speech recognition


Abstract: It is well known that the human auditory system achieves performance that automatic speech recognition (ASR) systems cannot match, and that the fractional Fourier transform (FrFT) has unique advantages in non-stationary signal processing. In this paper, a Gammatone filterbank is applied to speech signals for front-end temporal filtering, and acoustic features of the output subband signals are then extracted using the fractional Fourier transform. Considering the critical effect of the transform order on the FrFT, an order adaptation method based on the instantaneous frequency is proposed, and its performance is compared with that of a method based on the ambiguity function. ASR experiments are conducted on clean and noisy Putonghua digits. The results show that the proposed features achieve a significantly higher recognition rate than the MFCC baseline, and that the order adaptation method based on instantaneous frequency has much lower complexity than the one based on the ambiguity function. Furthermore, the FrFT-based features achieve the highest recognition rate when the proposed order adaptation method is used.
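
The Gammatone front end mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used 4th-order Gammatone impulse response with Glasberg-Moore ERB bandwidths and center frequencies spaced uniformly on the ERB-rate scale. The channel count, frequency range, impulse-response length, and all function names are illustrative assumptions, not values or code from the paper. The FrFT feature extraction and order adaptation stages are not shown.

```python
import numpy as np


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def erb_space(f_low, f_high, n):
    """Return n center frequencies spaced uniformly on the ERB-rate scale."""
    e_low = 21.4 * np.log10(4.37 * f_low / 1000.0 + 1.0)
    e_high = 21.4 * np.log10(4.37 * f_high / 1000.0 + 1.0)
    e = np.linspace(e_low, e_high, n)
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def gammatone_ir(fc, fs, order=4, duration=0.032):
    """Truncated Gammatone impulse response, normalized to unit energy."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)  # bandwidth parameter of the envelope
    g = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))


def gammatone_filterbank(x, fs, n_channels=8, f_low=100.0, f_high=3500.0):
    """Filter x with each channel; returns (center_freqs, subband_signals)."""
    fcs = erb_space(f_low, f_high, n_channels)
    # Causal FIR filtering: keep the first len(x) samples of each convolution.
    subbands = np.stack(
        [np.convolve(x, gammatone_ir(fc, fs))[: len(x)] for fc in fcs]
    )
    return fcs, subbands
```

Calling `gammatone_filterbank(x, fs)` on a one-dimensional signal `x` sampled at `fs` Hz returns one subband signal per channel. In a pipeline like the one the abstract describes, each subband would then be framed and passed to the FrFT-based feature extraction.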

Authors: YIN Hui, XIE Xiang, KUANG Jingming

Affiliation: Department of Electronic Engineering

Source: Chinese Journal of Acoustics (声学学报, English edition), 2011, No. 4, pp. 453-463 (11 pages)

Funding: Supported by the National Science and Technology Major Projects (2010ZX03004-003-01), the National Natural Science Foundation of China (90920304), and the Research Fund for the Doctoral Program of Higher Education of China (20101101110020).

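Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]; O438 [Mechanical Engineering: Optical Engineering]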

