Abstract
This paper briefly describes four feature extraction methods, the Linear Predictive Cepstral Coefficient (LPCC), the Teager Energy Operator (TEO), the Mel-Frequency Cepstral Coefficient (MFCC), and Zero Crossings with Peak Amplitudes (ZCPA), and applies them to emotional speech recognition. Two kinds of experiments are carried out. The first is a single-language experiment on the TYUT and Berlin databases. Its results show that, for a single language within a single database, all four features represent speech emotion effectively, with MFCC achieving the highest recognition rate. The second is a single-language experiment on merged databases. Most previous work on emotional features is based on one particular language in a single speech database, but in practice speakers' backgrounds and environments vary widely, so studying these features on merged databases is of practical significance. The second experiment indicates that all four features are database dependent; ZCPA shows the smallest drop in recognition rate and is therefore the least database dependent of the four.
Source
Noise and Vibration Control (《噪声与振动控制》)
CSCD
Peking University Core Journals (北大核心)
2011, No. 4, pp. 132-136 (5 pages)
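Funding
National Natural Science Foundation of China (No. 61072087)
Natural Science Foundation of Shanxi Province (No. 2010011020-1)
Shanxi Province Graduate Innovation Fund (No. 20093010)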
Keywords
acoustics
signal analysis
emotional speech recognition
database dependence
emotional features
merged database