
Research on Emotion Recognition Method of Music Multimodal Data
Abstract: Music emotion recognition has broad application prospects in fields such as intelligent music recommendation and music visualization. To address the limited effectiveness and poor interpretability of emotion recognition based only on low-level audio features, this work proceeds in three steps. First, ERMSLM (emotion recognition model based on skip-gram and LSTM using MIDI data), a model that learns the semantic information of notes from MIDI (musical instrument digital interface) data, is constructed; its features are the concatenation of three parts: melodic features extracted with a skip-gram model and an LSTM (long short-term memory) network, tonal features extracted by a pre-trained MLP (multilayer perceptron), and manually constructed features. Second, ERMBT (emotion recognition model based on BERT using text data), a text-based model that fuses lyrics and social tags, is constructed; its lyric features combine emotional features extracted with BERT (bidirectional encoder representations from transformers), emotion-lexicon features built from the ANEW (Affective Norms for English Words) list, and TF-IDF (term frequency-inverse document frequency) features of the lyrics. Finally, two multimodal fusion models, feature-level fusion and decision-level fusion, are constructed over the MIDI and text data. Experimental results show that ERMSLM and ERMBT achieve accuracies of 56.93% and 72.62%, respectively, and that the decision-level multimodal fusion model performs best.
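The decision-level fusion described in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: it assumes each unimodal model (the MIDI-based ERMSLM and the text-based ERMBT) outputs a class-probability vector, and fuses them with a weighted average. The weights, class count, and probability values below are hypothetical.

```python
import numpy as np

def decision_level_fusion(p_midi, p_text, w_midi=0.4, w_text=0.6):
    """Fuse two models' class-probability vectors by weighted averaging.

    The weights are illustrative; in practice they could be tuned on a
    validation set to reflect each modality's reliability.
    """
    p_midi = np.asarray(p_midi, dtype=float)
    p_text = np.asarray(p_text, dtype=float)
    fused = w_midi * p_midi + w_text * p_text
    return fused / fused.sum()  # renormalize in case weights do not sum to 1

# Example with 4 emotion classes; the probabilities are made up.
p_midi = [0.10, 0.50, 0.30, 0.10]   # hypothetical ERMSLM output
p_text = [0.05, 0.70, 0.15, 0.10]   # hypothetical ERMBT output
fused = decision_level_fusion(p_midi, p_text)
predicted_class = int(np.argmax(fused))
```

Decision-level fusion like this keeps the two unimodal models independent, which matches the abstract's finding that combining their decisions outperforms either model alone.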
Authors: HAN Dong-hong, KONG Yan-ru, ZHAN Yi-meng, LIU Yuan (School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China; NARI Group Corporation, State Grid Electric Power Research Institute, Nanjing 211000, China)
Source: Journal of Northeastern University (Natural Science), 2024, No. 6, pp. 776-785, 792 (11 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61672144); National Key Research and Development Program of China (2019YFB1405302).
Keywords: music emotion recognition; deep learning; multimodal; LSTM
