Abstract
Feature extraction is a key stage of an emotional speech recognition system and determines its overall recognition performance. Traditional feature extraction techniques assume that the speech signal is linear and short-time stationary, and they are not adaptive. To address this, features are extracted in a nonlinear way using the Ensemble Empirical Mode Decomposition (EEMD) algorithm. The emotional speech signal is first decomposed by EEMD into a set of Intrinsic Mode Functions (IMFs), and the effective IMFs are selected with the correlation coefficient method. The IMF Energy (IMFE) features are then computed from the selected IMFs. The German Berlin speech database is used as the experimental data source. IMFE features, prosodic features, Mel-Frequency Cepstral Coefficient (MFCC) features, and the fusion of the three are each fed into a Support Vector Machine (SVM), and the recognition results for the different features are compared to validate the effectiveness of the IMFE features. The experimental results show that the average recognition rate reaches 91.67% when IMFE features are fused with acoustic features, and that different emotional states can be distinguished effectively.
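The selection and energy steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the IMFs have already been produced by some EEMD implementation and are passed in as plain lists. It uses the Spearman rank correlation named in the keywords. The threshold value `0.1`, the sum-of-squares definition of energy, and the function names are assumptions made for illustration.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def imf_energy_features(signal, imfs, threshold=0.1):
    """Keep IMFs whose |Spearman rho| with the signal meets the threshold
    (an assumed value) and return (kept indices, energy of each kept IMF)."""
    kept = [i for i, imf in enumerate(imfs)
            if abs(spearman(signal, imf)) >= threshold]
    # Energy is taken here as the sum of squared samples (an assumption).
    energies = [sum(v * v for v in imfs[i]) for i in kept]
    return kept, energies
```

Calling `imf_energy_features(signal, imfs)` returns the indices of the retained IMFs together with one energy value per retained IMF. Those values could then be concatenated with prosodic and MFCC features before classification.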
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journal (北大核心)
2017, Issue 8, pp. 306-309, 315 (5 pages)
Funding
National Natural Science Foundation of China (61371193)
Shanxi Province Research Fund for Returned Overseas Scholars (2013-034)
Keywords
feature extraction
Ensemble Empirical Mode Decomposition (EEMD)
Intrinsic Mode Function (IMF)
Spearman Rank correlation coefficient
acoustic feature
emotional speech recognition