Speech Emotion Recognition Fusing Functional Paralanguage Proportion Coefficient
Abstract Nonverbal vocalizations in speech, such as laughter, sighs, and sobs, are called functional paralanguage and play an important role in emotional expression. However, existing research has rarely considered the synergistic effect of multiple types of functional paralanguage within a single emotion. To address this issue, an emotion recognition system that fuses the functional paralanguage proportion coefficient (FPPC) is proposed. First, FPPC features are extracted that reflect how frequently multiple types of functional paralanguage occur in an emotional utterance and how long they last. Then, an attention-based ensemble learning model (attention stacking) is built to assign different weights to different base classifiers, and it is trained on the FPPC features. Finally, an adaptive entropy weight decision fusion method fuses traditional speech emotion recognition with FPPC-based emotion recognition. Experimental results show that emotion recognition improved by 16.84% after fusing FPPC features, indicating that fusing FPPC features effectively improves the overall recognition rate of the system.
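The abstract names the final step, adaptive entropy weight decision fusion, without giving its formula. The sketch below shows one common way such a fusion can be built, and it is an assumption rather than the authors' exact method: each classifier's posterior over emotion classes is weighted by how far its entropy falls below the maximum possible entropy, so a more confident classifier contributes more for that particular utterance. The function names (`normalize`, `entropy`, `entropy_weights`, `fuse`) and the example posteriors are hypothetical.

```python
import math


def normalize(p):
    """Scale a list of non-negative scores so that it sums to 1."""
    s = sum(p)
    return [x / s for x in p]


def entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def entropy_weights(posteriors):
    """One weight per classifier; lower entropy gives a larger weight.

    Assumes at least two classes, so the maximum entropy is positive.
    """
    max_h = math.log(len(posteriors[0]))
    # Confidence is 1 for a one-hot posterior and 0 for a uniform one.
    conf = [max(0.0, 1.0 - entropy(normalize(p)) / max_h) for p in posteriors]
    total = sum(conf)
    if total < 1e-9:
        # No classifier is informative: fall back to equal weights.
        return [1.0 / len(posteriors)] * len(posteriors)
    return [c / total for c in conf]


def fuse(posteriors):
    """Return (fused posterior, per-classifier weights) for one utterance."""
    probs = [normalize(p) for p in posteriors]
    weights = entropy_weights(probs)
    n_classes = len(probs[0])
    fused = [
        sum(w * p[k] for w, p in zip(weights, probs)) for k in range(n_classes)
    ]
    return normalize(fused), weights


if __name__ == "__main__":
    # Hypothetical posteriors over three emotion classes.
    acoustic = [0.40, 0.30, 0.30]  # traditional acoustic-feature classifier
    fppc = [0.10, 0.80, 0.10]      # FPPC-feature classifier
    fused, weights = fuse([acoustic, fppc])
    print("weights:", weights)
    print("fused posterior:", fused)
```

Because the weights are recomputed for every utterance, the fusion adapts per sample: when the FPPC branch is uncertain (for instance, when an utterance contains no functional paralanguage), its weight shrinks and the acoustic branch dominates.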
Authors SUN Ying; ZHOU Ya-ru; ZHANG Xue-ying (College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China)
Source Journal of Northeastern University (Natural Science), 2024, No. 1, pp. 40-48 (9 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding National Natural Science Foundation of China (62271342); Natural Science Foundation of Shanxi Province (201901D111096).
Keywords speech emotion recognition; proportion coefficient; functional paralanguage; attention mechanism; adaptive entropy weight decision fusion