Abstract
To address the complexity of speech emotion recognition, this article studies a deep learning based emotion recognition framework. First, features are extracted using Mel spectral coefficients, and audio data augmentation methods are introduced. Second, a Long Short-Term Memory (LSTM) network is used for emotion recognition. Finally, the method is tested on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). The experimental results show that the method can accurately classify speech samples.
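The preprocessing steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the abstract does not say which augmentation methods or Mel parameters were used, so additive Gaussian noise, circular time shifting, and the common 2595·log10(1 + f/700) Mel formula are assumptions, and all function names are hypothetical. The emotion codes follow the published RAVDESS file naming convention, in which the third hyphen-separated field encodes the emotion.

```python
import math
import random

# Emotion codes from the RAVDESS file naming convention (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def ravdess_emotion(filename):
    """Return the emotion label encoded in a RAVDESS file name."""
    stem = filename.rsplit("/", 1)[-1].split(".")[0]
    fields = stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS file name: {filename!r}")
    return RAVDESS_EMOTIONS[fields[2]]


def hz_to_mel(freq_hz):
    """Convert Hz to mel (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min, f_max):
    """Edge frequencies (Hz) of n_bands filters equally spaced in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


def add_noise(samples, noise_std, rng):
    """Augmentation (assumed): add zero-mean Gaussian noise to each sample."""
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def time_shift(samples, shift):
    """Augmentation (assumed): circularly shift the signal to the right."""
    if not samples:
        return []
    k = shift % len(samples)
    return samples[-k:] + samples[:-k] if k else list(samples)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the augmentation is repeatable
    signal = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
    augmented = time_shift(add_noise(signal, 0.01, rng), 20)
    print(ravdess_emotion("03-01-06-01-02-01-12.wav"), len(augmented))
    print([round(f) for f in mel_band_edges(4, 0.0, 8000.0)])
```

In practice the Mel features and the LSTM classifier would normally come from an audio library and a deep learning framework; the sketch only makes the data flow concrete: label from file name, augmentation of the waveform, and Mel-spaced frequency bands.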
Authors
BAI Yujie (白玉杰)
DING Mi (丁汨)
Zhengzhou Shuqing Medical College, Zhengzhou, Henan 450000, China
Source
Information & Computer (《信息与电脑》)
2024, No. 4, pp. 129-131 (3 pages)
Keywords
deep learning
sentiment analysis
voice dialogue