Abstract
To improve the accuracy of music emotion recognition, a model based on middle- and high-level features is proposed. Low-level features such as spectral features, chromagram, and MFCCs are avoided entirely; instead, the model takes as input middle- and high-level features that are closer to human perception of music emotion, including chords, meter, tempo, mode, instrumentation, texture, and melodic contour. A dataset of 385 music clips was constructed, and music emotion recognition was formulated as a regression problem: a machine learning model is trained to predict an 8-dimensional emotion vector for each clip. Experimental results show that, compared with low-level features, using middle- and high-level features as input improves the accuracy (R²) from 59.6% to 69.8%.
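The regression setup described above can be sketched briefly. This is a minimal illustration with synthetic data, not the authors' pipeline: the feature matrix and 8-dimensional emotion targets are randomly generated stand-ins for real mid/high-level features (chords, tempo, mode, etc.), and scikit-learn's `GradientBoostingRegressor` wrapped in `MultiOutputRegressor` stands in for the XGBoost model named in the keywords.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic stand-ins: 385 clips (matching the dataset size), 7 hypothetical
# mid/high-level features per clip, and an 8-dimensional emotion target.
X = rng.normal(size=(385, 7))
W = rng.normal(size=(7, 8))
y = X @ W + 0.1 * rng.normal(size=(385, 8))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# One gradient-boosted regressor per emotion dimension.
model = MultiOutputRegressor(GradientBoostingRegressor(random_state=0))
model.fit(X_tr, y_tr)

# R2 on the held-out split, averaged over the 8 emotion dimensions.
r2 = r2_score(y_te, model.predict(X_te))
print(f"R2 = {r2:.3f}")
```

Swapping in `xgboost.XGBRegressor` for `GradientBoostingRegressor` would match the paper's keyword list more closely; the multi-output wrapping and R² evaluation stay the same.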
Source
Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal
2017, No. 4, pp. 1029-1034 (6 pages)
Keywords
music emotion recognition
middle- and high-level musical features
machine learning
extreme gradient boosting (XGBoost)
affective computing