A NONLINEAR FEATURE FUSION METHOD OF SPEECH EMOTION RECOGNITION BASED ON ATTENTION MECHANISM

Cited by: 2
Abstract To address the dynamic dependence between temporal and spatial features in speech emotion recognition, a nonlinear spatiotemporal feature fusion model based on the attention mechanism is proposed. The model uses an attention-based long short-term memory (LSTM) network to extract temporal features from the speech signal and a temporal convolutional network (TCN) to extract spatial features; an attention mechanism then fuses the two sets of features nonlinearly, and the fused high-level features are fed to a fully connected layer for emotion recognition. The model is evaluated on the IEMOCAP dataset. The experimental results show that the method accounts for the intrinsic correlation between temporal and spatial features, and that, compared with linear fusion, a network using attention-based nonlinear feature fusion effectively improves speech emotion recognition accuracy.
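The contrast between linear fusion and attention-based fusion can be illustrated with a small sketch. The abstract does not give the fusion formula, so everything below is an assumption made for illustration: the function names `attention_fuse` and `linear_fuse`, the `tanh` scoring rule, and the `query` vector (which would be a learned parameter in a trained network) are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def linear_fuse(temporal, spatial):
    """Baseline: linear fusion by simple concatenation."""
    return temporal + spatial


def attention_fuse(temporal, spatial, query):
    """Assumed attention fusion of two equal-length feature vectors.

    Each branch gets a score from a hypothetical query vector applied to
    tanh(features). The softmax of the scores gives input-dependent
    weights, so the fused output is a nonlinear function of the inputs.
    """
    branches = (temporal, spatial)
    scores = [dot(query, [math.tanh(x) for x in feat]) for feat in branches]
    weights = softmax(scores)
    fused = [weights[0] * t + weights[1] * s for t, s in zip(temporal, spatial)]
    return fused, weights


if __name__ == "__main__":
    temporal = [1.0, 0.0]  # stand-in for the LSTM branch output
    spatial = [0.0, 1.0]   # stand-in for the TCN branch output
    query = [1.0, 0.0]     # hypothetical; learned in a real model
    fused, weights = attention_fuse(temporal, spatial, query)
    print("weights:", weights)
    print("fused:", fused)
    print("concat:", linear_fuse(temporal, spatial))
```

In this sketch the weights depend on the features themselves, which is what separates it from a fixed linear combination; in the paper's pipeline the fused vector would then be passed to the fully connected classification layer.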
Authors Zhou Weidong (周伟东); Zhou Houpan (周后盘); Xia Pengfei (夏鹏飞) (College of Automation (Artificial Intelligence), Hangzhou Dianzi University, Hangzhou 310000, Zhejiang, China)
Source Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2023, No. 1, pp. 216-221, 272 (7 pages)
Keywords Speech emotion recognition; Long short-term memory network; Temporal convolutional network; Nonlinear fusion