Abstract
Unlike speech recognition, audio comparison does not require audio reconstruction. Provided that the main information in the audio is preserved, the original audio is compressed with a second-order Haar wavelet transform. The centroid, the root mean square (RMS), and the first 12 Mel-frequency cepstral coefficients (MFCCs), which represent the main information in the audio, are then extracted frame by frame, and the Euclidean distance is computed separately for each of these three kinds of parameters. The comparison is completed according to the relationship between each Euclidean distance and a threshold ε. Experiments show that this scheme achieves high accuracy and good real-time performance for audio comparison, and it is expected to play an active role in computer-based audio recognition and speech recognition.
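The pipeline summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. "Second-order" is read here as two passes of the Haar averaging step. "Centroid" is assumed to mean the spectral centroid of each frame. The frame size, hop, thresholds, and the rule that every distance must fall below its threshold are hypothetical choices. The 12 MFCCs are omitted for brevity, so only the centroid and RMS tracks are compared.

```python
import cmath
import math


def haar_approx(signal, levels=2):
    """Keep only the Haar approximation coefficients after `levels` passes."""
    out = list(signal)
    for _ in range(levels):
        if len(out) % 2:  # pad odd-length input by repeating the last sample
            out.append(out[-1])
        out = [(out[i] + out[i + 1]) / math.sqrt(2) for i in range(0, len(out), 2)]
    return out


def split_frames(signal, size, hop):
    """Cut the signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def rms(frame):
    """Root mean square of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_centroid(frame):
    """Magnitude-weighted mean DFT bin index (assumed meaning of 'centroid')."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * m for k, m in enumerate(mags)) / total


def features(signal, size=64, hop=32):
    """Compress with Haar, then return per-frame centroid and RMS tracks."""
    compressed = haar_approx(signal, levels=2)
    frs = split_frames(compressed, size, hop)
    return [spectral_centroid(f) for f in frs], [rms(f) for f in frs]


def euclidean(a, b):
    """Euclidean distance over the common prefix of two feature tracks (assumption)."""
    n = min(len(a), len(b))
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(n)))


def is_match(sig_a, sig_b, eps_centroid, eps_rms):
    """Hypothetical decision rule: every distance must not exceed its threshold."""
    cent_a, rms_a = features(sig_a)
    cent_b, rms_b = features(sig_b)
    return (euclidean(cent_a, cent_b) <= eps_centroid
            and euclidean(rms_a, rms_b) <= eps_rms)
```

Calling `is_match(a, b, eps_centroid, eps_rms)` on two lists of samples returns a Boolean. A fuller version would add a third distance over the first 12 MFCCs of each frame and would tune the thresholds on labeled audio pairs.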
Source
《河南师范大学学报(自然科学版)》
CAS
CSCD
PKU Core Journal
2006, No. 2, pp. 35-38 (4 pages)
Journal of Henan Normal University (Natural Science Edition)
Funding
Key Youth Fund Project of the Sichuan Provincial Department of Education (2002A117)
Keywords
wavelet transform
audio parameter
Euclidean distance
audio similarity