
New Method of Speech Emotion Recognition Fusing Functional Paralanguages (cited by 5)
Abstract: Sound burst features such as laughter, cries, and sighs (termed functional paralanguages) carry a great deal of emotional information, yet sentences containing them show a lower overall emotion recognition rate because these bursts interfere with recognition. To address this problem, this paper proposes a speech emotion recognition method that fuses functional paralanguages. The method first automatically detects functional paralanguages in the sentence to be recognized and, based on the detection results, separates them from the sentence, yielding two relatively pure signals: a functional-paralanguage signal and a traditional speech signal. Finally, the emotional information of the two signals is fused with an adaptive weight fusion method, improving both the emotion recognition rate of the sentence and the robustness of the system. Experiments on a speaker-independent emotion corpus containing six functional paralanguages and six typical emotions show that the proposed method achieves an average recognition rate of 67.41%, which is higher than linear weighted fusion, Dempster-Shafer (DS) evidence theory, and Bayesian fusion by 4.2%, 2.8%, and 2.4%, respectively, and higher than the pre-fusion result by 8.08%. The method thus offers good robustness and recognition accuracy for speaker-independent speech emotion recognition.
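The adaptive weight fusion step described in the abstract can be sketched as follows. The paper's exact weighting rule is not given on this page, so the confidence-based weighting below (the function name and its parameters are illustrative assumptions) is only a hedged sketch of the general technique: each channel's per-class emotion probabilities are combined with weights proportional to that channel's confidence.

```python
import numpy as np

def adaptive_weight_fusion(p_speech, p_para, conf_speech, conf_para):
    """Fuse per-class emotion probabilities from the traditional-speech
    channel and the functional-paralanguage channel.

    p_speech, p_para : per-class probability vectors from each classifier
    conf_speech, conf_para : confidence scores for each channel
    (e.g. the classifier's maximum posterior) -- an assumed proxy,
    not necessarily the paper's weighting criterion.
    """
    # Adaptive weights: proportional to channel confidence, summing to 1.
    w_s = conf_speech / (conf_speech + conf_para)
    w_p = 1.0 - w_s
    fused = w_s * np.asarray(p_speech, dtype=float) \
          + w_p * np.asarray(p_para, dtype=float)
    # Renormalize; the predicted emotion is the argmax of the fused vector.
    return fused / fused.sum()
```

When the paralanguage channel is more confident (e.g. a clearly detected laugh), its probabilities dominate the fused decision, which is the intuition behind weighting adaptively rather than with a fixed linear combination.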
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》, CSCD), 2014, No. 2, pp. 186-199 (14 pages)
Funding: National Natural Science Foundation of China (Grant No. 61272211); Natural Science Foundation of Jiangsu Province (Grant No. BK2011521); Senior Talent Foundation of Jiangsu University (Grant No. 10JDG065)
Keywords: speech emotion recognition; functional paralanguage; automatic detection; adaptive weight; fusion recognition
  • Related Literature

References (4)

Secondary References (41)

  • 1 Zhang Yibin, Zhou Jie, Bian Zhaoqi, Zhang Dapeng. A content-based two-level segmentation method for audio streams [J]. Chinese Journal of Computers, 2006, 29(3): 457-465. (Cited by 7)
  • 2 Cheng S S, Wang H M. A sequential metric-based audio segmentation method via the Bayesian information criterion [C]//Proceedings of Eurospeech, Geneva, 2003: 945-948.
  • 3Chen S S, Gopalakrishnan P. Speaker, environment and channel change detection and clustering via the Bayesian information criterion [C] //Proceedings of the DARPA Workshop, Lansdowne, 1998: 127-132.
  • 4Cettolo M, Vescovi M. Efficient audio segmentation algorithms based on the BIC [C] //Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, 2003:537-540.
  • 5 Tritschler A, Gopinath R. Improved speaker segmentation and segments clustering using the Bayesian information criterion [C]//Proceedings of the Eurospeech, Budapest, 1999: 2997-3000.
  • 6 Cettolo M, Vescovi M, Rizzi R. Evaluation of BIC based algorithms for audio segmentation [J]. Computer Speech and Language, 2005, 19(2): 147-170.
  • 7Sivakumaran P, Fortuna J, Ariyaeeinia A M. On the use of the Bayesian information criterion in multiple speaker detection[C] //Proceedings of the Eurospeech, Scandinavia, 2001:795-798.
  • 8 Ajmera J, McCowan I A, Bourlard H. Robust HMM based speech/music segmentation [C]//Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, 2002: 297-300.
  • 9Gauvain J L, Lamel L, Adda G. The LIMSI broadcast news transcription system [J]. Speech Communication, 2002, 37 (1): 89-108.
  • 10 Lu L, Li S Z, Zhang H J. Content-based audio segmentation using support vector machines [C]//Proceedings of International Conference on Multimedia and Expo, Tokyo, 2001: 749-752.

Co-citing Literature (10)

Co-cited Literature (36)

  • 1 Zhang Yibin, Zhou Jie, Bian Zhaoqi, Zhang Dapeng. A content-based two-level segmentation method for audio streams [J]. Chinese Journal of Computers, 2006, 29(3): 457-465. (Cited by 7)
  • 2 ISHI C T, ISHIGURO H, HAGITA N. Automatic extraction of paralinguistic information using prosodic features related to F0, duration and voice quality [J]. Speech Communication, 2008, 50: 531-543.
  • 3 CHENG S S, WANG H M. A sequential metric-based audio segmentation method via the Bayesian information criterion [C]//Proceedings of Eurospeech. Geneva, 2003: 945-948.
  • 4 CHEN S S, GOPALAKRISHNAN P. Speaker, environment and channel change detection and clustering via the Bayesian information criterion [C]//Proceedings of the DARPA Workshop. Lansdowne: [s.n.], 1998: 127-132.
  • 5 CETTOLO M, VESCOVI M. Efficient audio segmentation algorithms based on the BIC [C]//Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. Hong Kong: IEEE, 2003: 537-540.
  • 6 Cettolo M, Vescovi M, Rizzi R. Evaluation of BIC based algorithms for audio segmentation [J]. Computer Speech and Language, 2005, 19(2): 147-170.
  • 7 MAO QiRong, WANG XiaoJia, ZHAN YongZhao. Speech emotion recognition method based on improved decision tree and layered feature selection [J]. International Journal of Humanoid Robotics, 2010: 245-261.
  • 8 Yu Junqing, Hu Xiaoqiang, Sun Kai. An improved hybrid audio segmentation method [J]. Journal of Computer-Aided Design & Computer Graphics, 2010, 22(7): 1174-1181. (Cited by 4)
  • 9 Zheng Nengheng, Zhang Yalei, Li Xia. A music segmentation algorithm based on online model updating and smoothing [J]. Journal of Shenzhen University (Science and Engineering), 2011, 28(3): 271-275. (Cited by 2)
  • 10 Qin Haibo, Bai Yanqiang, Wu Bin, Wang Jun, Liu Xueyong, Jing Xiaolu. Research progress on emotion in manned spaceflight [J]. Space Medicine & Medical Engineering, 2012, 25(4): 302-306. (Cited by 12)

Citing Literature (5)

Secondary Citing Literature (12)
