Abstract
The Hilbert-Huang Transform (HHT) is applied to emotional speech to obtain its marginal spectrum. The marginal spectra of speech in four emotional states (happy, angry, disgusted, and neutral) are then compared, and four features are proposed for emotion recognition: subband energy (SE), the first-order difference of subband energy (DSE), subband energy cepstral coefficients (SECC), and the first-order difference of SECC (DSECC). In speaker-independent, text-independent emotion recognition experiments, these features achieve a best recognition rate of 90%, which is 22 percentage points higher than that of Mel-frequency cepstral coefficients (MFCC), a feature based on the Fourier transform. The results indicate that features derived from the HHT marginal spectrum capture the emotional information in speech signals well.
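The abstract names the four features but does not define them, so the following is only a minimal sketch of how such features might be computed from a marginal spectrum that has already been obtained. Everything specific here is an assumption rather than the authors' method: equal-width subbands, energy taken as the sum of squared spectral amplitudes, the difference taken between consecutive frames, and cepstral coefficients computed as a DCT-II of log subband energies by analogy with MFCC. The function names are hypothetical.

```python
import math


def subband_energies(marginal_spectrum, n_bands):
    """SE (assumed definition): sum of squared amplitudes in equal-width bands."""
    n = len(marginal_spectrum)
    if not 1 <= n_bands <= n:
        raise ValueError("n_bands must be between 1 and len(marginal_spectrum)")
    # Integer band edges; every band is non-empty because n_bands <= n.
    edges = [i * n // n_bands for i in range(n_bands + 1)]
    return [
        sum(v * v for v in marginal_spectrum[edges[i]:edges[i + 1]])
        for i in range(n_bands)
    ]


def cepstral_coefficients(energies, n_coeffs, eps=1e-12):
    """SECC (assumed definition): unnormalized DCT-II of log subband energies."""
    m = len(energies)
    log_e = [math.log(e + eps) for e in energies]  # eps guards against log(0)
    return [
        sum(log_e[j] * math.cos(math.pi * k * (j + 0.5) / m) for j in range(m))
        for k in range(n_coeffs)
    ]


def first_difference(frames):
    """DSE / DSECC (assumed definition): difference between consecutive frames."""
    return [
        [b - a for a, b in zip(prev, cur)]
        for prev, cur in zip(frames, frames[1:])
    ]


# Two toy "frames" of a marginal spectrum (made-up numbers, not real speech data).
frame_spectra = [
    [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0],
    [2.0, 2.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0],
]
se = [subband_energies(s, n_bands=4) for s in frame_spectra]
dse = first_difference(se)
secc = [cepstral_coefficients(e, n_coeffs=3) for e in se]
dsecc = first_difference(secc)
```

In this sketch each frame yields one SE vector and one SECC vector, and the difference features have one fewer frame than their base features. Computing the marginal spectrum itself (empirical mode decomposition followed by the Hilbert transform of each intrinsic mode function) is outside the scope of the example.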
Source
Technical Acoustics (《声学技术》)
CSCD
2009, No. 2, pp. 148-152 (5 pages)
Keywords
emotion recognition
marginal spectrum
HHT