Abstract
Speech emotion recognition plays an important role in human-computer interaction, artificial intelligence (AI), natural language processing (NLP), 5G technology, and related fields. To address the low recognition rate of single-modal models and the drawbacks of manual parameter tuning, this paper first adds KNN, CNB, and AdaBoost single-modal models to Gaurav Sahu's base models and proposes the multi-modal combination model C3. It then applies permutations and combinations so that the models are combined automatically by computer, overcoming the limitations of Sahu's manual combination. Finally, the network models are trained and tested with hyper-parameter optimization and cross-validation, which removes the need for manual tuning. Experiments on the IEMOCAP dataset show that C3 improves speech emotion recognition performance by 1.56% over E2, the multi-modal combination model proposed by Gaurav Sahu.
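The automatic-combination idea in the abstract can be illustrated with a small sketch. It is not the paper's C3 model: the three classifiers below are toy stand-ins written in pure Python (the paper uses KNN, CNB, AdaBoost, and Sahu's base models), the data is synthetic rather than IEMOCAP, and every name (`make_data`, `MODELS`, `cv_accuracy`, `search`) is hypothetical. Under those assumptions, it enumerates every subset of the base models, combines each subset by majority vote, tunes one hyper-parameter over a small grid, and ranks the candidates by cross-validated accuracy.

```python
from collections import Counter
from itertools import combinations
import random


def make_data(n=120, seed=0):
    """Synthetic two-class data; a stand-in for real acoustic features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 * label
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn_fit(X, y, k):
    """k-nearest-neighbour classifier; k is the tuned hyper-parameter."""
    def predict(x):
        nearest = sorted(range(len(X)), key=lambda i: sq_dist(X[i], x))[:k]
        return Counter(y[i] for i in nearest).most_common(1)[0][0]
    return predict


def centroid_fit(X, y, dims=None):
    """Nearest-class-mean classifier, optionally on a subset of features."""
    dims = list(dims) if dims is not None else list(range(len(X[0])))
    means = {}
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(p[d] for p in pts) / len(pts) for d in dims]

    def predict(x):
        return min(means, key=lambda c: sq_dist([x[d] for d in dims], means[c]))
    return predict


def stump_fit(X, y):
    """One-feature rule: nearest class mean on feature 0 only."""
    return centroid_fit(X, y, dims=[0])


# Hypothetical registry of base models (toy stand-ins, not the paper's models).
MODELS = {"knn": knn_fit, "centroid": centroid_fit, "stump": stump_fit}


def fit_model(name, X, y, k):
    return knn_fit(X, y, k) if name == "knn" else MODELS[name](X, y)


def cv_accuracy(names, X, y, k, folds=5):
    """Cross-validated accuracy of a majority-vote ensemble of `names`."""
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        members = [fit_model(name, Xtr, ytr, k) for name in names]
        for i in test:
            votes = Counter(m(X[i]) for m in members)
            correct += int(votes.most_common(1)[0][0] == y[i])
    return correct / len(X)


def search(X, y, k_grid=(1, 3, 5)):
    """Score every non-empty model subset; grid-search k when KNN is included."""
    results = []
    for r in range(1, len(MODELS) + 1):
        for names in combinations(sorted(MODELS), r):
            for k in (k_grid if "knn" in names else (None,)):
                results.append((cv_accuracy(names, X, y, k), names, k))
    return max(results, key=lambda t: t[0]), results


if __name__ == "__main__":
    X, y = make_data()
    (score, names, k), _ = search(X, y)
    print(f"best combination: {names}, k={k}, CV accuracy={score:.3f}")
```

Because the search is exhaustive, its cost grows exponentially with the number of base models; with a larger pool, the subset size `r` would likely need to be capped.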
Authors
CHEN Jun (陈军)
WANG Li (王力)
XU Ji (徐计)
Affiliations: College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, Guizhou; College of Information Engineering, Guizhou Institute of Engineering Application Technology, Bijie 551700, Guizhou
Source
Software (《软件》)
2019, No. 12, pp. 56-60, 214 (6 pages in total)
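Funding
National Natural Science Foundation of China (project: Multi-granularity big data analysis theory and methods based on the leading tree structure; grant no. 61966005)
Major Research Project for Innovation Groups of the Guizhou Provincial Department of Education (project: Research on multi-source heterogeneous data fusion for targeted poverty alleviation and platform construction; grant no. 黔教合KY字[2016]057)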
Keywords
Single-modal model
Multi-modal combination model
Hyper-parameter optimization
Speech emotion recognition
Cross-validation
Automatic combination