
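Multi-modal Emotion Recognition Based on Video and Audio
(Chinese title: 基于表情和语音的多模态情感识别研究, "Research on Multi-modal Emotion Recognition Based on Facial Expression and Speech"; cited by: 3)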
Abstract: Because any single feature is limited, emotion recognition from a single modality often produces results that differ considerably from reality, either because the modality carries too little useful information or because it carries too much noise. Input features of different types, by contrast, contain fuller, complementary emotional information. This study therefore uses the eNTERFACE database, taking SIFT features as the facial-expression features and the 1,582-dimensional acoustic and statistical features extracted with the openSMILE toolkit as the speech features, and performs emotion recognition using support vector machine (SVM) and sparse representation (SR) methods. Finally, the modalities are combined by decision-level fusion (the journal's English abstract describes this as fusion at the score level), which achieves comparatively good results on this database.
Authors: Wang Bei (王蓓), Wang Xiaolan (王晓兰)
Source: Informatization Research (《信息化研究》), 2014, No. 1, pp. 48-50 (3 pages)
Keywords: multi-modal; video; audio (speech); emotion recognition
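
The decision-level fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which fusion rule was used, so a weighted sum of per-class scores is assumed here. The function name `fuse_decisions`, the weight `w_face`, and the example scores are all hypothetical.

```python
import numpy as np

def fuse_decisions(face_scores, speech_scores, w_face=0.5):
    """Weighted-sum decision-level fusion of two classifiers' class scores.

    face_scores, speech_scores: arrays of shape (n_samples, n_classes)
    holding non-negative per-class scores from each modality's classifier.
    w_face: hypothetical weight for the facial modality, in [0, 1].
    Returns the predicted class index for each sample.
    """
    face = np.asarray(face_scores, dtype=float)
    speech = np.asarray(speech_scores, dtype=float)
    if face.shape != speech.shape:
        raise ValueError("both modalities must score the same samples and classes")
    if not 0.0 <= w_face <= 1.0:
        raise ValueError("w_face must lie in [0, 1]")
    # Normalize each row so the two modalities are on a comparable scale.
    face = face / face.sum(axis=1, keepdims=True)
    speech = speech / speech.sum(axis=1, keepdims=True)
    # Assumed fusion rule: convex combination of the two score vectors.
    fused = w_face * face + (1.0 - w_face) * speech
    return fused.argmax(axis=1)

# Made-up scores for two samples over three emotion classes.
face = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
speech = [[0.1, 0.2, 0.7], [0.1, 0.8, 0.1]]
labels = fuse_decisions(face, speech, w_face=0.5)
```

In this sketch the weight controls how much each modality contributes, so raising `w_face` lets the facial classifier override the speech classifier when the two disagree. In practice such a weight would be chosen on held-out data.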










