
Emotion recognition of animal voices based on attention mechanism BiLSTM
Abstract Sound is an important means by which animals express emotion to the outside world. By extracting features from animal sounds and establishing a mapping between those features and the animals' emotions, computers can perceive and understand animals' emotional states. To improve animal emotion recognition performance, this paper proposes a recognition method for animal sounds based on a Bi-directional Long Short-Term Memory (BiLSTM) network with the Bahdanau attention mechanism. The method extracts the spectral centroid, spectral bandwidth, spectral rolloff point, zero-crossing rate, root mean square energy, spectral contrast, and Mel-frequency cepstral coefficients together with their first-order differences as the feature vector, which is fed into the BiLSTM network. The attention mechanism then learns channel-wise weights for the emotional features, and a fully connected layer finally classifies the emotion category. Taking dogs as an example, emotion recognition experiments were conducted on dog sounds. The results show that the proposed method achieves higher recognition accuracy than methods based on recurrent neural networks and on BiLSTM networks without attention.
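Two of the features listed in the abstract, zero-crossing rate and root mean square energy, have simple per-frame definitions. The following is a minimal sketch of those two only, using the Python standard library. The frame length, hop size, sample rate, and synthetic test tone are illustrative assumptions, not values from the paper, and the spectral and cepstral features are not covered here.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a tail shorter than frame_len is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Square root of the mean squared amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# Assumed toy input: 0.1 s of a 440 Hz tone at 8 kHz, amplitude 0.5.
SAMPLE_RATE = 8000
samples = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(800)]

# One (zcr, rms) pair per frame; a real pipeline would append further features.
features = [(zero_crossing_rate(f), rms_energy(f))
            for f in frame_signal(samples, frame_len=256, hop=128)]

for zcr, rms in features:
    print(f"zcr={zcr:.3f}  rms={rms:.3f}")
```

Each frame yields one row of the feature sequence. In a setup like the one the abstract describes, such rows would be concatenated with the remaining features before being passed to the recurrent network.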
Authors HU Wenxing; CAI Jiaxin; KE Zhenyu; PENG Shuozhong; HU Song; ZHAO Xiaoyan (School of Information and Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, China)
Source Intelligent Computer and Applications, 2024, No. 7, pp. 57-63 (7 pages)
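Funding Jiangsu Province College Students' Practice and Innovation Training Program (202311276087Y); Nanjing Institute of Technology Scientific Research Start-up Fund for Introduced Talents (YKJ202019).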
Keywords animal emotion recognition; Bahdanau attention mechanism; bidirectional long short-term memory network; feature extraction
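The attention step named in the abstract can be illustrated with additive (Bahdanau-style) scoring. The sketch below pools a sequence of hidden vectors into one context vector by scoring each time step with v · tanh(W h + b) and normalising the scores with a softmax. It rests on several assumptions: the hidden states are random stand-ins for BiLSTM outputs, the parameters `W`, `b`, and `v` are random rather than learned, the decoder-state term of the original Bahdanau formulation is omitted, and weighting is done over time steps. The abstract describes channel-wise weighting, and how the paper realises that is not recoverable from this record.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention_pool(hidden_states, W, b, v):
    """Pool T hidden vectors (size D) into one context vector.

    W is an A x D matrix, b and v are length-A vectors (all hypothetical here).
    """
    scores = []
    for h in hidden_states:
        # Project the hidden vector and squash it: tanh(W h + b).
        proj = [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
                for row, bias in zip(W, b)]
        # Scalar score for this time step: v . proj.
        scores.append(sum(vi * pi for vi, pi in zip(v, proj)))
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of the hidden vectors.
    context = [sum(a * h[d] for a, h in zip(weights, hidden_states))
               for d in range(dim)]
    return context, weights


random.seed(0)
T, D, A = 6, 4, 3  # time steps, hidden size, attention size (illustrative)
hidden_states = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
W = [[random.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(A)]
b = [0.0] * A
v = [random.uniform(-0.5, 0.5) for _ in range(A)]

context, weights = additive_attention_pool(hidden_states, W, b, v)
print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
```

In a trained model the context vector would be passed to the fully connected classification layer, and `W`, `b`, and `v` would be learned jointly with the rest of the network.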
