Abstract
This paper presents a new feature for emotion recognition in continuous speech: the normalized amplitude quotient (NAQ), a time-domain parameter of the glottal excitation in vowel segments. The glottal flow is first estimated with Iterative Adaptive Inverse Filtering (IAIF), and NAQ values are then used to characterize the opening and closing behavior of the glottis. The experimental data are utterances covering six emotions from the eNTERFACE'05 audio-visual emotion database. Histograms and box plots of the NAQ values of emotional vowel segments are first used to analyze how the feature is distributed and how well it separates the emotions. The mean, variance, maximum, and minimum of the vowel-segment NAQ values of each utterance are then used as features in speech emotion recognition experiments with Gaussian mixture models (GMM) and a k-nearest neighbor classifier. The results indicate that vowel-segment NAQ values discriminate well between emotions and can serve as an effective feature for speech emotion recognition.
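The NAQ parameterization and the four per-utterance statistics named in the abstract can be illustrated with a short sketch. The abstract itself gives no formula, so the code assumes the commonly cited definition NAQ = f_ac / (d_peak · T), where f_ac is the peak-to-peak amplitude of the glottal flow in one cycle, d_peak is the magnitude of the negative peak of the flow derivative, and T is the fundamental period. The function names (`naq`, `utterance_features`, `triangular_pulse`) and the synthetic pulse are hypothetical, and the glottal flow is taken as already estimated (the IAIF step is not implemented here).

```python
import numpy as np


def naq(flow, fs, period_s):
    """NAQ of one glottal cycle, assuming NAQ = f_ac / (d_peak * T).

    flow     : samples of the estimated glottal flow for one cycle
    fs       : sampling rate in Hz
    period_s : fundamental period T in seconds
    """
    flow = np.asarray(flow, dtype=float)
    f_ac = flow.max() - flow.min()      # peak-to-peak flow amplitude
    deriv = np.diff(flow) * fs          # first difference scaled to units per second
    d_peak = -deriv.min()               # magnitude of the negative derivative peak
    if d_peak <= 0 or period_s <= 0:
        raise ValueError("flow must have a closing phase and T must be positive")
    return f_ac / (d_peak * period_s)


def utterance_features(naq_values):
    """Mean, variance, maximum and minimum of the NAQ values of one utterance."""
    v = np.asarray(naq_values, dtype=float)
    return np.array([v.mean(), v.var(), v.max(), v.min()])


def triangular_pulse(n_open, n_close, n_closed):
    """Hypothetical test pulse: linear rise, linear fall, then a closed phase."""
    rise = np.linspace(0.0, 1.0, n_open + 1)
    fall = np.linspace(1.0, 0.0, n_close + 1)[1:]
    return np.concatenate([rise, fall, np.zeros(n_closed)])


if __name__ == "__main__":
    fs = 10_000                          # assumed sampling rate
    cycle = triangular_pulse(60, 20, 19) # 100 samples, so T = 0.01 s
    value = naq(cycle, fs, period_s=len(cycle) / fs)
    print("NAQ of the synthetic cycle:", value)
    print("Utterance features:", utterance_features([value, 0.15, 0.25]))
```

For a triangular pulse the assumed definition reduces to the closing-phase duration divided by the period, which is why a shorter, more abrupt glottal closure gives a smaller NAQ. The four-element vector returned by `utterance_features` is the kind of input a GMM or k-nearest neighbor classifier would consume in the experiments the abstract describes.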
Source
《计算机仿真》
CSCD
PKU Core Journal (北大核心)
2009, No. 2, pp. 183-186 (4 pages)
Computer Simulation
Funding
National Natural Science Foundation of China (Grant No. 60703104)
Keywords
Normalized amplitude quotient (NAQ)
Iterative adaptive inverse filtering (IAIF)
Gaussian mixture models (GMM)
Nearest neighbor algorithm