
Novel acoustic features for speech emotion recognition (cited by: 2)

Abstract: This paper focuses on acoustic features that improve the recognition of emotion in human speech. The novel features are spectral entropy parameters: fast Fourier transform (FFT) spectral entropy, delta FFT spectral entropy, Mel-frequency filter bank (MFB) spectral entropy, and delta MFB spectral entropy. Spectral entropy features are simple to compute. They reflect the frequency characteristics of speech and how those characteristics change. We implement an emotion rejection module using the probability distributions of recognized-scores and rejected-scores, which reduces the false recognition rate and so improves overall performance. Recognized-scores and rejected-scores are the probabilities of recognized and rejected emotion recognition results, respectively. These scores are first obtained from a pattern recognition procedure that uses the Gaussian mixture model (GMM). We classify four emotional states: anger, sadness, happiness, and neutrality. The proposed method is evaluated using 45 sentences per emotion from 30 subjects (15 male and 15 female). Experimental results show that the proposed method is superior to existing GMM-based emotion recognition methods that use energy, zero crossing rate (ZCR), linear prediction coefficient (LPC), and pitch parameters. We demonstrate the effectiveness of the proposed approach. One of the proposed features, the combination of MFB and delta MFB spectral entropy, improves performance by approximately 10% compared to the existing feature parameters for speech emotion recognition. Applying emotion rejection to results with low confidence scores yields a 4% performance improvement.
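As a rough illustration of the FFT spectral entropy and delta features named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the common definition of spectral entropy as the Shannon entropy of a frame's normalized power spectrum, and it takes the delta feature to be a simple frame-to-frame difference. The abstract does not state the frame length, windowing, or entropy base, so those choices and the function names `power_spectrum`, `spectral_entropy`, and `delta` are hypothetical. An MFB variant would presumably apply a Mel filter bank to the power spectrum before normalizing; that step is omitted here.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of the one-sided DFT of a real-valued frame.

    Uses a naive O(N^2) DFT to stay dependency-free; a real system
    would use an FFT library instead.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in bits) of the frame's normalized power spectrum.

    Assumed definition: treat the normalized power in each frequency bin
    as a probability and compute -sum(p * log2(p)).
    """
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: define the entropy as zero
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def delta(values):
    """Frame-to-frame difference of a feature track.

    The first entry is 0.0 so the output has the same length as the input.
    """
    if not values:
        return []
    return [0.0] + [b - a for a, b in zip(values, values[1:])]
```

Under this definition, a frame whose energy sits in one frequency bin has entropy near zero, while a frame with a flat spectrum reaches the maximum of log2 of the number of bins, which is the sense in which the feature reflects frequency characteristics.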
Source: Science China (Technological Sciences) (中国科学(技术科学英文版)), indexed in SCIE, EI, and CAS. 2009, Issue 7, pp. 1838-1848 (11 pages).
Funding: Supported by MIC, Korea, under ITRC IITA-2009-(C1090-0902-0046), and by the Korea Science and Engineering Foundation (KOSEF), funded by the Korea government (MEST) (Grant No. 20090058909).
Keywords: speech emotion recognition MFB spectral entropy entropy emotion recognition rejection

