Abstract
Detecting the vocal segments in popular music effectively is valuable for music retrieval, browsing, and cataloguing in large databases, as well as for melody extraction and singer recognition. This paper analyzes the music signal using Mel-frequency cepstral coefficients (MFCC), a feature widely used in speech signal processing, and uses Gaussian mixture models (GMM) to model the accompaniment (non-vocal) and singing (vocal) parts of the music separately, thereby achieving intelligent detection of the vocal segments. The conventional approach trains one GMM per class from a single set of training data hand-labeled as vocal and non-vocal. Building on that approach, this paper trains an additional GMM for each class from a set of pure vocal data and a set of pure non-vocal data, respectively, and then linearly combines the two vocal GMMs and the two non-vocal GMMs to obtain the probability model for each class. A likelihood classifier serves as the decision function. Experimental results show that the method effectively improves the detection performance of the system.
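The combination and decision steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the diagonal covariances, the mixing weight `alpha`, the 2-dimensional toy features standing in for MFCC frames, and all names (`DiagGMM`, `combine`, `is_vocal`) are assumptions, since the abstract does not specify them. A linear combination of two GMMs is itself a GMM whose component weights are scaled by `alpha` and `1 - alpha`.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DiagGMM:
    """Gaussian mixture with diagonal covariances (an assumed model form)."""
    weights: List[float]          # mixture weights, summing to 1
    means: List[List[float]]      # one mean vector per component
    variances: List[List[float]]  # one variance vector per component

    def log_likelihood(self, x: Sequence[float]) -> float:
        """Log density of one feature vector (e.g. one MFCC frame)."""
        log_terms = []
        for w, mu, var in zip(self.weights, self.means, self.variances):
            log_p = math.log(w)
            for xi, m, v in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            log_terms.append(log_p)
        peak = max(log_terms)  # log-sum-exp for numerical stability
        return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def combine(a: DiagGMM, b: DiagGMM, alpha: float) -> DiagGMM:
    """Return alpha * a + (1 - alpha) * b as a single GMM (0 < alpha < 1)."""
    return DiagGMM(
        weights=[alpha * w for w in a.weights]
        + [(1 - alpha) * w for w in b.weights],
        means=a.means + b.means,
        variances=a.variances + b.variances,
    )


def is_vocal(frames, vocal: DiagGMM, non_vocal: DiagGMM) -> bool:
    """Likelihood decision: choose the class with the higher total log-likelihood."""
    ll_vocal = sum(vocal.log_likelihood(f) for f in frames)
    ll_non_vocal = sum(non_vocal.log_likelihood(f) for f in frames)
    return ll_vocal > ll_non_vocal


# Toy single-component models standing in for trained GMMs (hypothetical values).
vocal_labeled = DiagGMM([1.0], [[1.0, 1.0]], [[1.0, 1.0]])
vocal_pure = DiagGMM([1.0], [[2.0, 2.0]], [[1.0, 1.0]])
non_vocal_labeled = DiagGMM([1.0], [[-1.0, -1.0]], [[1.0, 1.0]])
non_vocal_pure = DiagGMM([1.0], [[-2.0, -2.0]], [[1.0, 1.0]])

# One combined model per class; alpha = 0.5 is an arbitrary illustrative choice.
vocal_model = combine(vocal_labeled, vocal_pure, alpha=0.5)
non_vocal_model = combine(non_vocal_labeled, non_vocal_pure, alpha=0.5)

segment = [[1.5, 1.5], [1.0, 2.0]]  # two toy "frames" from one segment
print(is_vocal(segment, vocal_model, non_vocal_model))
```

In practice the component GMMs would be trained on MFCC frames (for example with an EM implementation from an existing library), and `alpha` would be tuned on held-out data.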
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2009, No. 5, pp. 1017-1020 (4 pages)
Journal of Chinese Computer Systems
Funding
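Supported by the National Natural Science Foundation of China (60702071)
Supported by the Applied Basic Research Fund of the Sichuan Provincial Department of Science and Technology, Digital Immune Network Research Project (2006J13~065)
Supported by the Program for New Century Excellent Talents in University, Ministry of Education (NCET-06-0811)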