Abstract
This paper proposes a new method for discriminating speech from laughter that uses spectral stability as the feature parameter. An analysis of the spectral stability of speech and laughter shows that the value for speech is clearly smaller than that for laughter, which indicates that spectral stability can serve as a feature parameter for distinguishing the two. The discrimination performance of Spectral Stability (SS), Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP) and pitch is compared under the same experimental conditions. The experimental results show that, with spectral stability as the feature parameter, the accuracy of speech/laughter discrimination is 90.74% in the speaker-dependent case and 73.63% in the speaker-independent case, and that its discrimination power is superior to that of the other feature parameters.
Source
《电子与信息学报》
EI
CSCD
Peking University Core Journals
2008, No. 6, pp. 1359-1362 (4 pages)
Journal of Electronics & Information Technology
Funding
Supported by the National Natural Science Foundation of China (60572141)
Keywords
Spontaneous speech recognition
Speech-laughter discrimination
Spectral stability
Speech events