期刊文献+

基于参数迁移和卷积循环神经网络的语音情感识别 被引量:29

Speech Emotion Recognition Model Based on Parameter Transfer and Convolutional Recurrent Neural Network
下载PDF
导出
摘要 在语音情感识别研究中,已有基于深度学习的方法大多没有针对语音时频两域的特征进行建模,且存在网络模型训练时间长、识别准确性不高等问题。语谱图是语音信号转换后具有时频两域的特殊图像,为了充分提取语谱图时频两域的情感特征,提出了一种基于参数迁移和卷积循环神经网络的语音情感识别模型。该模型把语谱图作为网络的输入,引入AlexNet网络模型并迁移其预训练的卷积层权重参数,将卷积神经网络输出的特征图重构后输入LSTM(Long Short-Term Memory)网络进行训练。实验结果表明,所提方法加快了网络训练的速度,并提高了情感识别的准确率。 In the study of speech emotion recognition, most methods based on deep learning don’t model the time-frequency characteristics of speech. Moreover, the network model has long training time and the recognition accuracy is not high.The spectrogram is a special image with both time and frequency domains after the conversion of speech signals. In order to fully extract the emotional features of time-frequency domain of the spectrogram, this paper proposes a speech emotion recognition model based on parameter transfer and convolutional recurrent neural network. The proposed model uses the spectrogram as the input of network, introduces the AlexNet network model, and transfers its weighting parameters of pretrained convolutional layer. The output feature maps of convolutional neural network is put into long short-term memory neural networks for training after being reconstructed. The experimental results show that the proposed method has faster speed of network training and higher accuracy of emotion recognition.
作者 缪裕青 邹巍 刘同来 周明 蔡国永 MIAO Yuqing;ZOU Wei;LIU Tonglai;ZHOU Ming;CAI Guoyong(School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi541004, China;Guilin Hivision Technology Co. Ltd., Guilin, Guangxi 541004, China)
出处 《计算机工程与应用》 CSCD 北大核心 2019年第10期135-140,198,共7页 Computer Engineering and Applications
基金 国家自然科学基金(No.61763007) 广西自然科学基金(No.2014GXNSFAA118395) 广西高校图像图形智能处理重点实验室研究项目(No.GIIP201706) 广西自然科学基金重点项目(No.2017GXNSFDA198028) 桂林电子科技大学研究生教育创新计划资助项目(No.2016YJCX72 No.2017YJCX50)
关键词 语谱图 深度学习 参数迁移 卷积循环神经网络 语音情感识别 spectrogram deep learning parameter transfer convolutional recurrent neural network speech emotion recognition
  • 相关文献

参考文献2

二级参考文献87

  • 1van Bezooijen R,Otto SA,Heenan TA. Recognition of vocal expressions of emotion:A three-nation study to identify universal characteristics[J].{H}JOURNAL OF CROSS-CULTURAL PSYCHOLOGY,1983,(04):387-406.
  • 2Tolkmitt FJ,Scherer KR. Effect of experimentally induced stress on vocal parameters[J].Journal of Experimental Psychology Human Perception Performance,1986,(03):302-313.
  • 3Cahn JE. The generation of affect in synthesized speech[J].Journal of the American Voice Input/Output Society,1990.1-19.
  • 4Moriyama T,Ozawa S. Emotion recognition and synthesis system on speech[A].Florence:IEEE Computer Society,1999.840-844.
  • 5Cowie R,Douglas-Cowie E,Savvidou S,McMahon E,Sawey M,Schro. Feeltrace:An instrument for recording perceived emotion in real time[A].Belfast:ISCA,2000.19-24.
  • 6Grimm M,Kroschel K. Evaluation of natural emotions using self assessment manikins[A].Cancun,2005.381-385.
  • 7Grimm M,Kroschel K,Narayanan S. Support vector regression for automatic recognition of spontaneous emotions in speech[A].IEEE Computer Society,2007.1085-1088.
  • 8Eyben F,Wollmer M,Graves A,Schuller B Douglas-Cowie E Cowie R. On-Line emotion recognition in a 3-D activation-valencetime continuum using acoustic and linguistic cues[J].Journal on Multimodal User Interfaces,2010,(1-2):7-19.
  • 9Giannakopoulos T,Pikrakis A,Theodoridis S. A dimensional approach to emotion recognition of speech from movies[A].Taibe:IEEE Computer Society,2009.65-68.
  • 10Wu DR,Parsons TD,Mower E,Narayanan S. Speech emotion estimation in 3d space[A].Singapore:IEEE Computer Society,2010.737-742.

共引文献168

同被引文献213

引证文献29

二级引证文献93

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部