Abstract
[Objective] To address vocal music classification in music information retrieval, this paper fuses the statistical features and image features of audio and explores a better-performing classification model. [Methods] We extracted statistical features and Mel spectrogram image features from the audio. We applied machine learning methods to the statistical features and designed a multi-layer convolutional neural network for the image features, which turns vocal music classification into an image classification problem. Finally, we proposed a deep learning method that fuses the statistical features and the image features. [Results] On the vocal music classification task, the deep learning method based on image features achieved an F1 value about 6 percentage points higher than the machine learning methods. The deep learning model based on feature fusion reached an F1 value above 69%, which is 3.4 percentage points higher than the deep learning model based on image features alone. [Limitations] The experimental dataset is small, so the advantages of deep learning methods could not be fully realized. [Conclusions] The sampling parameters of the Mel spectrogram strongly influence the results of the deep model, and the proposed feature fusion method effectively improves vocal music classification performance.
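The pipeline described in the abstract (Mel spectrogram extraction, clip-level statistical features, and fusion of the two) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not state the spectrogram parameters, the statistical feature set, the network architecture, or the fusion operator, so everything below is an assumption made for illustration: the HTK-style mel formula, `n_fft=512`, `hop_length=256`, `n_mels=40`, the four toy statistics, time-averaging as a stand-in for the convolutional network, and fusion by simple concatenation.

```python
import numpy as np


def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other conventions exist).
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 band edges, equally spaced on the mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_spectrogram(y, sr, n_fft=512, hop_length=256, n_mels=40):
    """Log-power mel spectrogram in dB, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop_length
    frames = np.stack([
        y[i * hop_length: i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T   # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)                 # small offset avoids log(0)


def statistical_features(y):
    """Toy clip-level statistics: mean, std, zero-crossing rate, RMS."""
    zcr = np.mean(np.abs(np.diff(np.sign(y))) > 0)
    return np.array([y.mean(), y.std(), zcr, np.sqrt(np.mean(y ** 2))])


def fuse(image_embedding, stat_features):
    """Fusion by concatenation (hypothetical choice of fusion operator)."""
    return np.concatenate([image_embedding, stat_features])


# Synthetic 1-second, 440 Hz tone in place of a real vocal recording.
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440.0 * t)

spec = log_mel_spectrogram(y, sr)
embedding = spec.mean(axis=1)  # stand-in for the CNN's learned image embedding
fused = fuse(embedding, statistical_features(y))
```

In a real system the spectrogram would be fed to a trained convolutional network and the fused vector to a classifier. Changing `n_fft`, `hop_length`, or `n_mels` changes the size and resolution of the image the network sees, which is one plausible reason the sampling parameters matter as much as the conclusions report.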
Authors
孟镇
王昊
虞为
邓三鸿
张宝隆
Meng Zhen; Wang Hao; Yu Wei; Deng Sanhong; Zhang Baolong (School of Information Management, Nanjing University, Nanjing 210023, China; Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core Journal
2021, Issue 5, pp. 59-70 (12 pages)
Data Analysis and Knowledge Discovery
Funding
This work is one of the research outcomes of the Major Project of the National Social Science Fund of China (Grant No. 17ZDA291).
Keywords
Vocal Music Classification
Convolutional Neural Network (CNN)
Feature Fusion
Music Information Retrieval
Mel Spectrogram