Journal Article

Spectral Stability Feature Based Novel Method for Discriminating Speech and Laughter (cited by: 3)
Abstract: This paper proposes a novel method that uses spectral stability as a feature parameter to discriminate speech from laughter. Analysis of the spectral stability of the two signal types shows that the value for speech is clearly smaller than that for laughter, indicating that spectral stability can serve as a discriminating feature. The discrimination performance of Spectral Stability (SS), Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), and pitch is compared under identical experimental conditions. The results show that spectral stability achieves discrimination accuracies of 90.74% and 73.63% in the speaker-dependent and speaker-independent cases respectively, and that its discrimination power is superior to that of the other feature parameters.
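The record does not give the paper's exact formula for spectral stability, so the following is only an illustrative sketch: here "stability" is assumed to be the reciprocal of the mean frame-to-frame Euclidean distance between log-magnitude spectra, so that a signal whose spectrum changes little from frame to frame scores higher.

```python
import numpy as np

def spectral_stability(signal, frame_len=512, hop=256, eps=1e-10):
    """Hypothetical spectral-stability score (not the paper's formula):
    reciprocal of the mean frame-to-frame Euclidean distance between
    log-magnitude spectra of windowed frames."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    spectra = [np.log(np.abs(np.fft.rfft(f)) + eps) for f in frames]
    dists = [np.linalg.norm(spectra[k + 1] - spectra[k])
             for k in range(len(spectra) - 1)]
    return 1.0 / (np.mean(dists) + eps)

# Toy check: a steady 440 Hz tone has a nearly constant spectrum,
# while white noise varies strongly from frame to frame, so the
# tone should score higher on this measure.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
noise = np.random.default_rng(0).standard_normal(fs)
print(spectral_stability(tone) > spectral_stability(noise))  # True
```

Under this sketch, discriminating speech from laughter would reduce to thresholding (or feeding a classifier with) the per-utterance stability score; the paper's reported result is that laughter yields the larger value.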
Source: Journal of Electronics & Information Technology (EI, CSCD, Peking University Core), 2008, No. 6, pp. 1359-1362 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant 60572141)
Keywords: spontaneous speech recognition; speech/laughter discrimination; spectral stability; speech events
Related Literature

References (13)

  • 1 Rose R C and Riccardi G. Modeling disfluency and background events in ASR for a natural language understanding task. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, Phoenix, AZ, USA, March 15-19, 1999, Vol.1: 341-344.
  • 2 Stouten F, Duchateau J, Martens J P, et al. Coping with disfluencies in spontaneous speech recognition: acoustic detection and linguistic context manipulation. Speech Communication, 2006, 48: 1590-1606.
  • 3 Chinese Linguistic Data Consortium. http://www.chineseldc.org/resourse.asp.
  • 4 Cai R, Lu L, Zhang H J, and Cai L H. Highlight sound effects detection in audio stream. In Proc. IEEE International Conference on Multimedia and Expo, Baltimore, USA, July 6-9, 2003, Vol.3: 37-40.
  • 5 Lockerd A and Mueller F. LAFCam: leveraging affective feedback camcorder. In Proc. CHI 2002 Conference on Human Factors in Computing Systems, Minneapolis, USA, 2002: 574-575.
  • 6 Kennedy L S and Ellis D P W. Laughter detection in meetings. In NIST ICASSP 2004 Meeting Recognition Workshop, Montreal, Canada, 2004: 11-14.
  • 7 Ito A, Wang Xinyue, Suzuki M, and Makino S. Smile and laughter recognition using speech processing and face recognition from conversation video. In Proc. 2005 International Conference on Cyberworlds, Nanyang Executive Centre, Singapore, November 23-25, 2005: 437-444.
  • 8 Truong K P and van Leeuwen D A. Automatic discrimination between laughter and speech. Speech Communication, 2007, 49(2): 144-158.
  • 9 Hermansky H. Perceptual linear predictive (PLP) analysis of speech. Journal of the Acoustical Society of America, 1990, 87(4): 1738-1752.
  • 10 Sun Xuejing. Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, Florida, USA, May 2002, Vol.1: 333-336.

Co-cited References (27)

  • 1 Stouten F, Duchateau J, Martens J P, et al. Coping with disfluencies in spontaneous speech recognition: acoustic detection and linguistic context manipulation [J]. Speech Communication, 2006, 48(11): 1590-1606.
  • 2 Cai R, Lu L, Zhang H J, et al. Highlight sound effects detection in audio stream [C]//Proceedings of IEEE International Conference on Multimedia and Expo. Baltimore: IEEE, 2003: 37-40.
  • 3 Kennedy L S, Ellis D P W. Laughter detection in meetings [C]//Proceedings of NIST ICASSP 2004 Meeting Recognition Workshop. Montreal: National Institute of Standards and Technology, 2004: 118-121.
  • 4 Knox M T, Mirghafori N. Automatic laughter detection using neural networks [C]//Proceedings of InterSpeech. Antwerp: International Speech Communication Association, 2007: 2973-2976.
  • 5 Laskowski K, Schultz T. Detection of laughter-in-interaction in multichannel close-talk microphone recordings of meetings [C]//Proceedings of the 5th International Workshop on Machine Learning for Multimodal Interaction. Utrecht: Springer-Verlag, 2008: 149-160.
  • 6 Knox M T, Morgan N, Mirghafori N. Getting the last laugh: automatic laughter segmentation in meetings [C]//Proceedings of InterSpeech. Brisbane: International Speech Communication Association, 2008: 797-800.
  • 7 Garg G, Ward N. Detecting filled pauses in tutorial dialogs [R]. El Paso: Department of Computer Science, University of Texas at El Paso, 2006: 1-9.
  • 8 Audhkhasi K, Kandhway K, Deshmukh O D, et al. Formant-based technique for automatic filled-pause detection in spontaneous spoken English [C]//Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing. Taipei: IEEE, 2009: 4857-4860.
  • 9 Li Y X, He Q H, Kwong S, et al. Characteristics-based effective applause detection for meeting speech [J]. Signal Processing, 2009, 89(8): 1625-1633.
  • 10 Carter A. Automatic acoustic laughter detection [D]. Staffordshire: Department of Electronic Engineering, Keele University, 2000.

Citing Literature (3)

Secondary Citing Literature (7)
