
A Speech Emotion Recognition Method Based on a Lightweight Capsule Network
Abstract: Current speech emotion recognition models have many parameters, a heavy computational load, and slow training. To address these problems, this paper proposes a lightweight network model suited to small datasets. The model uses the capsule network as its base structure and replaces the original convolutional layer of the capsule network with a depthwise separable convolution module to reduce computation. Transfer learning is used to extract general low-level image features, and the whole network is then fine-tuned on spectrograms, which reduces overfitting on small datasets. Cosine similarity is used to measure the similarity between vectors in the dynamic routing structure, improving the performance of the dynamic routing algorithm. Experimental results show that the lightweight capsule network achieves a higher recognition rate and faster computation than the seven deep learning models used for comparison.
Authors: WANG Ying; GAO Sheng (School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318; School of Mechanical Science and Engineering, Northeast Petroleum University, Daqing, Heilongjiang 163318)
Source: Journal of University of Electronic Science and Technology of China (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2023, No. 3, pp. 423-429 (7 pages)
Funding: National Natural Science Foundation of China (61702093); National Key R&D Program of China (2018YFE0196000); Natural Science Foundation of Heilongjiang Province (F2018003); Heilongjiang Postdoctoral Special Fund (LBH-Q20077).
Keywords: capsule network; depthwise separable convolution; speech emotion recognition; transfer learning
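
The abstract names two mechanisms that a short sketch may help make concrete: the parameter saving from depthwise separable convolution, and the use of cosine similarity as the agreement measure in dynamic routing. The following is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the layer sizes, the number of routing iterations, and the exact point at which cosine similarity enters the routing update are all hypothetical; the abstract only says that cosine similarity replaces the similarity measure in dynamic routing (standard dynamic routing uses a dot product).

```python
import math

def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between u and v; eps guards against zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def squash(s, eps=1e-12):
    """Capsule squashing non-linearity: keeps direction, maps length into [0, 1)."""
    sq = sum(x * x for x in s)
    scale = sq / (1.0 + sq) / math.sqrt(sq + eps)
    return [scale * x for x in s]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(u_hat, iterations=3):
    """Dynamic routing with cosine agreement (assumed variant).

    u_hat[i][j] is the prediction vector from input capsule i
    for output capsule j. Returns one output vector per output capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits
    v = []
    for _ in range(iterations):
        c = [softmax(row) for row in b]  # coupling coefficients per input capsule
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        for i in range(n_in):
            for j in range(n_out):
                # Assumption: agreement is the cosine similarity,
                # where standard dynamic routing would use a dot product.
                b[i][j] += cosine_similarity(u_hat[i][j], v[j])
    return v
```

For a hypothetical 3 x 3 layer with 64 input and 128 output channels, `standard_conv_params(3, 64, 128)` gives 73728 weights and `depthwise_separable_params(3, 64, 128)` gives 8768, roughly 8.4 times fewer. Because cosine similarity is bounded in [-1, 1], each routing logit changes by at most 1 per iteration, regardless of the lengths of the prediction vectors.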
