Abstract
The main purpose of speech emotion recognition is to classify speech signals by emotion, such as anger, fear, disgust, and happiness. This paper addresses the task of recognizing emotion from speech. The proposed approach uses a deep recurrent neural network trained on a sequence of acoustic features computed over short speech intervals. In addition, the CTC loss function makes it possible to handle long utterances that contain both emotional and neutral parts. Controlled experiments on the IEMOCAP corpus verify the high performance of the method.
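To make the role of the CTC loss more concrete, the following is a minimal sketch of the standard CTC forward recursion in plain Python. It is not the authors' implementation. The function name, the class indices, and the reading of the CTC blank symbol as "neutral or non-emotional frame" are assumptions made for illustration only. The sketch works in probability space, so it is suitable only for short toy sequences; practical implementations work in log space.

```python
import math

def ctc_neg_log_likelihood(frame_probs, labels, blank=0):
    """Negative log-likelihood of `labels` under CTC (hypothetical helper).

    frame_probs: list of T per-frame probability distributions over classes,
                 e.g. the softmax outputs of a recurrent network.
    labels:      target label sequence without blanks, e.g. [1] for one
                 utterance-level emotion.
    blank:       index of the CTC blank class (assumed here to absorb
                 neutral frames).
    """
    # Extended target: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    num_states = len(ext)

    # alpha[s]: total probability of all partial alignments that end in
    # state s of the extended target at the current frame.
    alpha = [0.0] * num_states
    alpha[0] = frame_probs[0][ext[0]]
    if num_states > 1:
        alpha[1] = frame_probs[0][ext[1]]

    for t in range(1, len(frame_probs)):
        new_alpha = [0.0] * num_states
        for s in range(num_states):
            total = alpha[s]                      # stay in the same state
            if s >= 1:
                total += alpha[s - 1]             # advance by one state
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * frame_probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end in the last label or in the trailing blank.
    likelihood = alpha[-1]
    if num_states > 1:
        likelihood += alpha[-2]
    return -math.log(likelihood)

# Toy example: 2 frames, classes {0: blank, 1: one emotion class}.
toy_probs = [[0.5, 0.5], [0.5, 0.5]]
loss = ctc_neg_log_likelihood(toy_probs, [1])
```

In the toy example, three of the four possible frame labelings collapse to the single target label, so the likelihood is 0.75. This shows how CTC sums over every placement of emotional frames within an utterance instead of requiring a frame-level annotation.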
Authors
YU Hua (余华)
YAN Bingcong (颜丙聪)
Jiangsu Open University, Nanjing 210065, China; School of Information Engineering, Southeast University, Nanjing, Jiangsu 210096, China
Source
《电子器件》
CAS
PKU Core Journals (北大核心)
2020, No. 4, pp. 934-937 (4 pages)
Chinese Journal of Electron Devices
Funding
National Natural Science Foundation of China (61673108).
Keywords
recurrent neural network
CTC loss function
speech emotion recognition