
New Method of Speech Emotion Recognition Fusing Functional Paralanguages (cited by 5)
Abstract: Sound burst features such as laughter, cries, and sighs (termed functional paralanguages) carry a great deal of emotional information, yet sentences containing them show a lower overall emotion recognition rate because these bursts interfere with recognition. To address this problem, this paper proposes a speech emotion recognition method that fuses functional paralanguages. The method first automatically detects functional paralanguages in the sentence to be recognized and, based on the detection results, separates them from the sentence, yielding two relatively pure signals: a functional-paralanguage signal and a traditional speech signal. Finally, the emotional information of the two signals is fused with an adaptive weight fusion method, improving both the emotion recognition rate of the sentence and the robustness of the system. Experiments on a speaker-independent emotion corpus containing six functional paralanguages and six typical emotions show that the proposed method achieves an average recognition rate of 67.41%, which is higher than linear weighted fusion, Dempster-Shafer (DS) evidence theory, and Bayesian fusion by 4.2%, 2.8%, and 2.4%, respectively, and higher than the pre-fusion result by 8.08%. The method thus offers good robustness and recognition accuracy for speaker-independent speech emotion recognition.
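The adaptive weight fusion step described in the abstract can be sketched as follows. The paper's exact weighting rule is not given on this page, so the confidence-based weighting below (the function name and its parameters are illustrative assumptions) is only a hedged sketch of the general technique: each channel's per-class emotion probabilities are combined with weights proportional to that channel's confidence.

```python
import numpy as np

def adaptive_weight_fusion(p_speech, p_para, conf_speech, conf_para):
    """Fuse per-class emotion probabilities from the traditional-speech
    channel and the functional-paralanguage channel.

    p_speech, p_para : per-class probability vectors from each classifier
    conf_speech, conf_para : confidence scores for each channel
    (e.g. the classifier's maximum posterior) -- an assumed proxy,
    not necessarily the paper's weighting criterion.
    """
    # Adaptive weights: proportional to channel confidence, summing to 1.
    w_s = conf_speech / (conf_speech + conf_para)
    w_p = 1.0 - w_s
    fused = w_s * np.asarray(p_speech, dtype=float) \
          + w_p * np.asarray(p_para, dtype=float)
    # Renormalize; the predicted emotion is the argmax of the fused vector.
    return fused / fused.sum()
```

When the paralanguage channel is more confident (e.g. a clearly detected laugh), its probabilities dominate the fused decision, which is the intuition behind weighting adaptively rather than with a fixed linear combination.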
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》, CSCD), 2014, No. 2, pp. 186-199 (14 pages)
Funding: National Natural Science Foundation of China (Grant No. 61272211); Natural Science Foundation of Jiangsu Province (Grant No. BK2011521); Senior Talent Foundation of Jiangsu University (Grant No. 10JDG065)
Keywords: speech emotion recognition; functional paralanguage; automatic detection; adaptive weight; fusion recognition
  • Related Literature

References (4)

Secondary References (41)

  • 1 Zhang Yibin, Zhou Jie, Bian Zhaoqi, Zhang Dapeng. A content-based two-level segmentation method for audio streams [J]. Chinese Journal of Computers, 2006, 29(3): 457-465. (Cited by 7)
  • 2 Cheng S S, Wang H M. A sequential metric-based audio segmentation method via the Bayesian information criterion [C]//Proceedings of Eurospeech, Geneva, 2003: 945-948.
  • 3Chen S S, Gopalakrishnan P. Speaker, environment and channel change detection and clustering via the Bayesian information criterion [C] //Proceedings of the DARPA Workshop, Lansdowne, 1998: 127-132.
  • 4Cettolo M, Vescovi M. Efficient audio segmentation algorithms based on the BIC [C] //Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, 2003:537-540.
  • 5 Tritschler A, Gopinath R. Improved speaker segmentation and segments clustering using the Bayesian information criterion [C]//Proceedings of the Eurospeech, Budapest, 1999: 2997-3000.
  • 6 Cettolo M, Vescovi M, Rizzi R. Evaluation of BIC based algorithms for audio segmentation [J]. Computer Speech and Language, 2005, 19(2): 147-170.
  • 7Sivakumaran P, Fortuna J, Ariyaeeinia A M. On the use of the Bayesian information criterion in multiple speaker detection[C] //Proceedings of the Eurospeech, Scandinavia, 2001:795-798.
  • 8 Ajmera J, McCowan I A, Bourlard H. Robust HMM based speech/music segmentation [C]//Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, 2002: 297-300.
  • 9Gauvain J L, Lamel L, Adda G. The LIMSI broadcast news transcription system [J]. Speech Communication, 2002, 37 (1): 89-108.
  • 10 Lu L, Li S Z, Zhang H J. Content-based audio segmentation using support vector machines [C]//Proceedings of International Conference on Multimedia and Expo, Tokyo, 2001: 749-752.

Co-citing Literature (10)

Co-cited Literature (36)

  • 1 Zhang Yibin, Zhou Jie, Bian Zhaoqi, Zhang Dapeng. A content-based two-level segmentation method for audio streams [J]. Chinese Journal of Computers, 2006, 29(3): 457-465. (Cited by 7)
  • 2 ISHI C T, ISHIGURO H, HAGITA N. Automatic extraction of paralinguistic information using prosodic features related to F0, duration and voice quality [J]. Speech Communication, 2008, 50: 531-543.
  • 3 CHENG S S, WANG H M. A sequential metric-based audio segmentation method via the Bayesian information criterion [C]//Proceedings of Eurospeech. Geneva, 2003: 945-948.
  • 4 CHEN S S, GOPALAKRISHNAN P. Speaker, environment and channel change detection and clustering via the Bayesian information criterion [C]//Proceedings of the DARPA Workshop. Lansdowne: [s.n.], 1998: 127-132.
  • 5 CETTOLO M, VESCOVI M. Efficient audio segmentation algorithms based on the BIC [C]//Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. Hong Kong: IEEE, 2003: 537-540.
  • 6 Cettolo M, Vescovi M, Rizzi R. Evaluation of BIC based algorithms for audio segmentation [J]. Computer Speech and Language, 2005, 19(2): 147-170.
  • 7 MAO QiRong, WANG XiaoJia, ZHAN YongZhao. Speech emotion recognition method based on improved decision tree and layered feature selection [J]. International Journal of Humanoid Robotics, 2010: 245-261.
  • 8 Yu Junqing, Hu Xiaoqiang, Sun Kai. An improved hybrid audio segmentation method [J]. Journal of Computer-Aided Design & Computer Graphics, 2010, 22(7): 1174-1181. (Cited by 4)
  • 9 Zheng Nengheng, Zhang Yalei, Li Xia. A music segmentation algorithm based on online model updating and smoothing [J]. Journal of Shenzhen University (Science and Engineering), 2011, 28(3): 271-275. (Cited by 2)
  • 10 Qin Haibo, Bai Yanqiang, Wu Bin, Wang Jun, Liu Xueyong, Jing Xiaolu. Research progress on emotion in manned spaceflight [J]. Space Medicine & Medical Engineering, 2012, 25(4): 302-306. (Cited by 12)

Citing Literature (5)

Secondary Citing Literature (12)
