Abstract
To address speech processing under noise such as environmental noise and channel distortion, an intelligent speech recognition system based on an adaptive psychoacoustic model was proposed, and an auditory model was established. The model integrates psychoacoustics and otoacoustic emission (OAE) into an automatic speech recognition (ASR) system. Experiments were carried out on the AURORA2 database under both clean-condition and multi-condition training. The results show that the proposed feature extraction method significantly improves the word recognition rate, outperforming Mel-frequency cepstral coefficients (MFCC), forward masking (FM), lateral inhibition (LI), and cepstral mean and variance normalization (CMVN), and effectively enhances the performance of the intelligent speech recognition system.
Source
Journal of Shenyang University of Technology (《沈阳工业大学学报》)
EI
CAS
PKU Core Journal
2017, No. 6, pp. 675-679 (5 pages)
Funding
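Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ151504, GJJ151505)
Jiangxi Provincial Education Reform Project (JXJG-14-28-3, JXJG-14-28-1, JXJG-14-28-6, JXJG-14-28-8)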
Keywords
Mel-frequency cepstral coefficient (MFCC)
otoacoustic emission (OAE)
self-adaptation
psychoacoustic filter
automatic speech recognition (ASR)
AURORA2 database
forward masking (FM)
lateral inhibition (LI)