Abstract
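This paper proposes a new multiple-observation compound feature (MO-CF) for support vector machine (SVM) based voice activity detection (VAD). The feature consists of two sub-features weighted by a balancing factor. It is optimized by finding the balancing factor that maximizes the area under the VAD performance curve (AUC), so as to combine the strengths of the individual sub-features. In selecting sub-features, each one must not only perform well on its own but also complement the others. To meet this requirement, two compound features, MO-CF1 and MO-CF2, are proposed. MO-CF2, which combines the multiple-observation signal-to-noise ratio (MO-SNR) feature with the multiple-observation maximum probability (MO-MP) feature, is more robust than MO-CF1. Experimental results show that, across a variety of noise environments, the algorithm performs better and is more robust than nine existing VAD algorithms.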
A multiple-observation compound feature (MO-CF) is presented for support vector machine (SVM) based statistical voice activity detection (VAD). The MO-CF is composed of at least two sub-features weighted by balancing factors. The optimal balancing factor is the one that maximizes the area under the ROC curve (AUC) of the detector. The selected sub-features must not only perform well individually but also complement each other. A multiple-observation signal-to-noise ratio sub-feature is then combined with a multiple-observation maximum probability sub-feature to achieve more robust performance. Tests show that the algorithm outperforms nine commonly used VAD techniques in various noisy scenarios with low SNRs.
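The balancing-factor search described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the convex combination `alpha * f1 + (1 - alpha) * f2`, the uniform grid over `alpha`, the synthetic sub-features, and the function names `auc` and `best_balancing_factor`. The compound score is also evaluated directly rather than through an SVM, which the paper uses as the classifier.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    speech frame scores higher than a non-speech frame (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

def best_balancing_factor(f1, f2, labels, grid=None):
    """Grid-search the balancing factor alpha that maximizes the AUC of
    the compound feature alpha * f1 + (1 - alpha) * f2 (assumed form)."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 21)
    best_alpha, best_auc = None, -1.0
    for alpha in grid:
        compound = alpha * f1 + (1.0 - alpha) * f2
        value = auc(compound, labels)
        if value > best_auc:
            best_alpha, best_auc = float(alpha), value
    return best_alpha, best_auc

# Synthetic stand-ins for two complementary sub-features (hypothetical data).
rng = np.random.default_rng(0)
labels = rng.random(400) < 0.5
f1 = labels + rng.normal(0.0, 1.0, 400)
f2 = labels + rng.normal(0.0, 1.0, 400)

alpha, score = best_balancing_factor(f1, f2, labels)
print(f"alpha={alpha:.2f}, AUC={score:.3f}")
```

Because the grid includes the endpoints 0 and 1, the selected compound feature can never have a lower AUC on the tuning data than either sub-feature alone. Any gain beyond that depends on how complementary the sub-features are.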
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2011, No. 9, pp. 1209-1214 (6 pages)
Journal of Tsinghua University (Science and Technology)