
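Application of the Normalized Amplitude Quotient in Speech Emotion Recognition (归一化振幅商在语音情感识别中的应用)

Published English title: Normalized Amplitude Quotient Feature in Emotion Recognition

Cited by: 1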
Abstract: This paper proposes a new feature for emotion recognition in continuous speech: the normalized amplitude quotient (NAQ), a time-domain parameter of the glottal excitation in vowel segments. The glottal flow is first estimated with Iterative Adaptive Inverse Filtering (IAIF), and NAQ values are then used to characterize the opening and closing behavior of the glottis. The experiments use speech in six different emotions from the eNTERFACE'05 audio-visual emotion database. First, taking the NAQ values of the vowel segments as features, histograms and boxplots are used to analyze the feature distributions and how well they separate the emotions. Then the mean, variance, maximum, and minimum of the NAQ values over the vowel segments of each utterance are used as features for speech emotion recognition with Gaussian Mixture Models (GMM) and a k-nearest neighbor classifier. The results show that the NAQ values of vowel segments discriminate well between speech emotions and can serve as an effective feature for emotion recognition from speech.
Authors: Bai Jie (白洁), Jiang Dongmei (蒋冬梅)
Source: Computer Simulation (《计算机仿真》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 2, pp. 183-186 (4 pages)
Funding: National Natural Science Foundation of China (60703104)
Keywords: Normalized amplitude quotient (NAQ); Iterative adaptive inverse filtering (IAIF); Gaussian mixture models (GMM); Nearest neighbor algorithm
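The abstract describes computing NAQ per vowel segment and then summarizing each utterance by four statistics. A minimal sketch of those two steps follows. It assumes the commonly cited definition from Alku et al. (2002), NAQ = f_ac / (d_peak · T), where f_ac is the peak-to-peak amplitude of the glottal flow, d_peak is the magnitude of the negative peak of the flow derivative, and T is the fundamental period. The function names `naq` and `utterance_features` are hypothetical, the IAIF inverse-filtering step is not implemented (the glottal flow of one cycle is taken as given), and the use of the population variance is an assumption, since the record does not say which variance the authors used.

```python
import numpy as np


def naq(flow_cycle, fs):
    """NAQ of one glottal cycle: f_ac / (d_peak * T).

    flow_cycle: estimated glottal flow samples spanning exactly one period
                (assumed to come from an inverse filter such as IAIF).
    fs:         sampling rate in Hz.
    """
    flow = np.asarray(flow_cycle, dtype=float)
    period = len(flow) / fs                 # T, in seconds
    f_ac = flow.max() - flow.min()          # peak-to-peak flow amplitude
    derivative = np.diff(flow) * fs         # first difference scaled to units/second
    d_peak = -derivative.min()              # magnitude of the negative derivative peak
    if d_peak <= 0:
        raise ValueError("flow has no closing phase (derivative is never negative)")
    return f_ac / (d_peak * period)


def utterance_features(naq_values):
    """Mean, variance, maximum, and minimum of the NAQ values of one utterance.

    Assumption: population variance (ddof=0).
    """
    v = np.asarray(naq_values, dtype=float)
    return np.array([v.mean(), v.var(), v.max(), v.min()])


if __name__ == "__main__":
    # Synthetic triangular pulse: 60-sample opening, 20-sample closing,
    # then a closed phase, 100 samples in total at 10 kHz.
    cycle = np.concatenate([
        np.linspace(0.0, 1.0, 61),
        np.linspace(1.0, 0.0, 21)[1:],
        np.zeros(19),
    ])
    print(naq(cycle, fs=10_000))
    print(utterance_features([0.1, 0.2, 0.3]))
```

For a triangular pulse like the synthetic one above, NAQ reduces to the closing-phase duration divided by the period (20 / 100 samples here), which is a convenient sanity check. The resulting four-dimensional vectors would then be passed to a classifier such as a GMM or k-nearest neighbor, as the abstract describes.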

References (10)

  • 1. Tato Raquel, Santos Rocio, Kompe Ralf, Pardo J M. Emotion space improves emotion recognition [C]. Proc. ICSLP, Denver, Colorado, 2002, 3: 2029-2032.
  • 2. Laver John. The Phonetic Description of Voice Quality [M]. Cambridge University Press, 1980.
  • 3. Scherer Klaus R. Vocal affect expression: A review and a model for future research [J]. Psychological Bulletin, 1986, 99(2): 143-165.
  • 4. Gobl Christer, Chasaide Ailbhe Ni. The role of voice quality in communicating emotion, mood and attitude [J]. Speech Communication, 2003, 40: 189-212.
  • 5. Alku Paavo, Backstrom Tom, Vilkman Erkki. Normalized amplitude quotient for parameterization of the glottal flow [J]. Journal of the Acoustical Society of America, 2002, 112(2): 701-710.
  • 6. Lehto Laura, et al. Comparison of two inverse filtering methods in parameterization of the glottal closing phase characteristics in different phonation types [J]. Journal of Voice, 2007, 21(2): 138-150.
  • 7. Airas Matti, Alku Paavo. Emotions in vowel segments of continuous speech: Analysis of the glottal flow using the normalized amplitude quotient [J]. Phonetica, 2006, 63(1): 26-46.
  • 8. Martin O, et al. The eNTERFACE'05 audio-visual emotion database [C]. Proceedings of the 22nd International Conference on Data Engineering Workshops, 2006.
  • 9. Alku Paavo. Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering [J]. Speech Communication, 1992, 11(2-3): 109-118.
  • 10. Young S J. The HTK Hidden Markov Model Toolkit: Design and Philosophy [R]. Technical Report, CUED, Cambridge University, 1994.
