Speech Emotion Recognition Combined with the Spectrum Feature of Glottal Waveform
Abstract: To improve the accuracy of speech emotion recognition, this paper studies two new spectral features of the glottal waveform, the parabolic spectral parameter (PSP) and the harmonic richness factor (HRF), and applies them to speech emotion recognition. First, for speech signals in six emotions (anger, fear, happiness, neutral, sadness, and surprise), commonly used features are extracted to form a feature vector: the speaking rate, together with the maximum, minimum, range, and mean of the short-time energy, the pitch frequency, the first three formants, and the 12th-order Mel-frequency cepstral coefficients (MFCC). Principal component analysis (PCA) is then used to reduce the dimension of this vector. Next, PSP and HRF are extracted from the glottal waveform, and their ability to express emotion is analyzed. Finally, a deep-learning stacked autoencoder is used to classify both the commonly used features alone and the features fused with the glottal spectral features. The results show that fusing the spectral features of the glottal waveform yields a higher recognition rate.
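The conventional-feature step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, random numbers stand in for real frame-level features, PCA is computed through an SVD as one standard option, and the choice of 10 retained components is arbitrary. Reading the abstract as 17 frame-level tracks (energy, pitch, 3 formants, 12 MFCCs) with four statistics each, plus the speaking rate, gives a 69-dimensional vector per utterance; that count is an inference from the abstract, not a figure it states.

```python
import numpy as np

def functionals(track):
    """Return max, min, range and mean of one frame-level feature track."""
    track = np.asarray(track, dtype=float)
    return np.array([track.max(), track.min(),
                     track.max() - track.min(), track.mean()])

def utterance_vector(speaking_rate, tracks):
    """Concatenate the speaking rate with the four statistics of each track."""
    parts = [np.array([float(speaking_rate)])]
    parts += [functionals(t) for t in tracks]
    return np.concatenate(parts)

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)              # center each feature
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T      # scores on the leading components

# Synthetic stand-in data (assumption): 20 utterances, 17 tracks, 50 frames.
rng = np.random.default_rng(0)
n_utterances, n_tracks, n_frames = 20, 17, 50
X = np.stack([
    utterance_vector(rng.uniform(2, 6),
                     rng.normal(size=(n_tracks, n_frames)))
    for _ in range(n_utterances)
])
Z = pca_reduce(X, n_components=10)       # 10 is an arbitrary choice
print(X.shape, Z.shape)                  # (20, 69) (20, 10)
```

In a fused setup, the glottal features (PSP and HRF) would be appended to the feature vector before classification; whether that happens before or after PCA is not specified in the abstract.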
Authors: LI Haoxuan (李昊璇), SHI Honghui (师宏慧), QIAO Xiaoyan (乔晓艳) (College of Physics and Electronics Engineering, Shanxi University, Taiyuan 030006, China)
Source: Journal of Test and Measurement Technology (《测试技术学报》), 2017, No. 1, pp. 8-16 (9 pages)
Funding: Shanxi Province Research Funding Program for Returned Overseas Scholars (2014-010); Natural Science Foundation of Shanxi Province (2013011016-2)
Keywords: glottal waveform; parabolic spectral parameter; harmonic richness factor; stacked autoencoder; speech emotion recognition
