Abstract
To improve the robustness of speech recognition in noisy environments, a method for extracting dynamic features of speech signals based on formant curves is proposed. A method based on the Hilbert-Huang transform is used to estimate the formant frequency features of the preprocessed speech signal. The first-formant frequency values of every preprocessed frame are then combined in frame order, from the first frame to the last, to obtain the first formant curve; the second, third, and fourth formant curves are obtained in the same way. A fast Fourier transform is applied to each formant curve to obtain its linear spectrum, after which the energy spectrum, the logarithmic energy, and the discrete cosine transform are computed. Compared with the MFCC method, the extracted dynamic features carry temporal correlation, revealing the close relationship between preceding, following, and adjacent parts of the speech signal, and they improve speech recognition performance.
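The per-curve processing chain described in the abstract (FFT, energy spectrum, logarithmic energy, discrete cosine transform) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the four formant tracks have already been estimated (the Hilbert-Huang step is not shown), and the function names, the number of retained coefficients `n_coeffs`, and the small constant `eps` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def dct_ii(x):
    """Unnormalized DCT-II of a 1-D array, written out with NumPy."""
    n = len(x)
    k = np.arange(n)[:, None]  # output (coefficient) index
    m = np.arange(n)[None, :]  # input (sample) index
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ x

def formant_curve_features(curve, n_coeffs=12, eps=1e-10):
    """FFT -> energy spectrum -> log energy -> DCT for one formant curve.

    n_coeffs and eps are assumed values; the abstract gives neither.
    """
    curve = np.asarray(curve, dtype=float)
    spectrum = np.fft.rfft(curve)       # linear spectrum of the curve
    energy = np.abs(spectrum) ** 2      # energy spectrum
    log_energy = np.log(energy + eps)   # eps guards against log(0)
    return dct_ii(log_energy)[:n_coeffs]

def dynamic_features(formant_tracks, n_coeffs=12):
    """Concatenate the per-curve features of every formant track.

    formant_tracks: array of shape (n_frames, n_formants), one row per
    frame in frame order, one column per formant (F1..F4 here).
    """
    tracks = np.asarray(formant_tracks, dtype=float)
    return np.concatenate(
        [formant_curve_features(tracks[:, i], n_coeffs)
         for i in range(tracks.shape[1])]
    )

# Synthetic formant tracks (in Hz) over 100 frames, for illustration only.
frames = np.arange(100)
tracks = np.stack(
    [
        500 + 50 * np.sin(2 * np.pi * frames / 50),
        1500 + 100 * np.sin(2 * np.pi * frames / 40),
        2500 + 80 * np.cos(2 * np.pi * frames / 30),
        3500 + 60 * np.cos(2 * np.pi * frames / 25),
    ],
    axis=1,
)
features = dynamic_features(tracks)
print(features.shape)  # (48,)
```

Because each curve spans the whole utterance, every coefficient depends on all frames at once, which is one way to read the abstract's claim that the features capture correlation across frames.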
Source
Computer Technology and Development (《计算机技术与发展》)
2017, No. 6, pp. 72-75, 80 (5 pages)
Funding
National Natural Science Foundation of China (61403042, 61503038)
Liaoning Province Education Scientific Research Project (L2013423)
Keywords
speech signal
dynamic feature
speech recognition
feature extraction
formant curve