Speech emotion recognition based on separable convolution and LSTM

Cited by: 9
Abstract: Speech emotion recognition is an active research topic in human-computer interaction. To address two weaknesses of ordinary convolutional neural networks, namely their large parameter count and their poor handling of temporal information, this paper presents a method that applies separable convolution and LSTM to speech emotion recognition, and validates it on the RAVDESS emotional speech corpus. The 1D Sep-CNN-LSTM model trained on MFCC features achieved a recognition accuracy of 90.77%, with the model compressed by about 40%. The 2D Sep-CNN-LSTM model trained on spectrogram features achieved a recognition accuracy of 82.21%, with the model compressed by about 75%. The experiments show that the method has certain advantages over other models for speech emotion recognition and is suitable for real-time lower-computer (slave device) applications.
Authors: LI Wen-jie; LUO Wen-jun; LI Yi-wen; SU Cheng-yue; CHEN Yu-huai; CAO Yue (School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China; School of Physics and Optoelectronic Engineering, Guangdong University of Technology, Guangzhou 510006, China)
Source: Information Technology (《信息技术》), 2020, No. 10, pp. 61-66 (6 pages)
Funding: Zhongshan Major Science and Technology Special Project (2016A1003).
Keywords: speech emotion recognition; separable convolution; LSTM; MFCC; spectrogram
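
The abstract attributes the model compression to replacing ordinary convolutions with separable ones. A minimal sketch of where that saving comes from is given below, assuming the common depthwise-separable formulation (one spatial filter per input channel, followed by a 1x1 pointwise convolution). The function names and the layer sizes in the usage lines are hypothetical and are not taken from the paper; the per-layer ratios they give are illustrative only and are not the whole-model 40% and 75% figures reported above, which also include the LSTM and other layers.

```python
from math import prod


def conv_params(kernel_shape, c_in, c_out, bias=True):
    """Trainable parameters of a standard convolution layer.

    kernel_shape is a tuple such as (5,) for 1D or (3, 3) for 2D.
    Every output channel has its own kernel over all input channels.
    """
    return prod(kernel_shape) * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(kernel_shape, c_in, c_out, bias=True):
    """Trainable parameters of a depthwise-separable convolution layer.

    Assumes a depth multiplier of 1: one spatial kernel per input
    channel (depthwise), then a 1x1 convolution that mixes channels
    (pointwise), with a single bias vector on the output.
    """
    depthwise = prod(kernel_shape) * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise + (c_out if bias else 0)


def reduction(kernel_shape, c_in, c_out):
    """Fraction of parameters saved by the separable variant."""
    std = conv_params(kernel_shape, c_in, c_out)
    sep = separable_conv_params(kernel_shape, c_in, c_out)
    return 1 - sep / std


# Hypothetical layer sizes, chosen only for illustration.
print(conv_params((5,), 64, 128), separable_conv_params((5,), 64, 128))
print(conv_params((3, 3), 32, 64), separable_conv_params((3, 3), 32, 64))
print(f"1D saving: {reduction((5,), 64, 128):.1%}")
print(f"2D saving: {reduction((3, 3), 32, 64):.1%}")
```

Because the 2D kernel multiplies two spatial dimensions into the standard layer's cost, the saving from factorising it tends to be larger than in the 1D case, which is consistent in direction with the larger compression the abstract reports for the 2D model.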