Abstract
Songs carry rich information about human emotion, and classifying songs by emotion helps organize and retrieve large volumes of music data. Many feature parameters in both the time domain and the frequency domain can be extracted from a song's audio signal. For the emotion classification task, this work extracts audio features including Mel-frequency cepstral coefficients (MFCCs), zero-crossing rate, and spectral centroid, and feeds both single features and fused features into classifiers to study how different feature parameters affect classification. In addition, two combined network classification models are built, each using a convolutional neural network (CNN) as the feature selection layer. Experiments show that, compared with traditional classification algorithms, the CNN-LSTM combined model achieves higher accuracy on the song audio emotion classification task.
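As a rough illustration of two of the features named in the abstract, the following minimal sketch computes the zero-crossing rate and the spectral centroid of one audio frame and concatenates them into a fused feature vector. It is not the paper's implementation: the function names, the frame length, the naive DFT, and the test tone are all assumptions made for the example. MFCC extraction is omitted because it needs a Mel filter bank and a cosine transform, which are usually taken from an audio library.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency bins (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    n = len(frame)
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    freqs = [k * sample_rate / n for k in range(len(mags))]
    return sum(f * m for f, m in zip(freqs, mags)) / total


def fused_features(frame, sample_rate):
    """Concatenate the per-frame features into one vector (hypothetical helper)."""
    return [zero_crossing_rate(frame), spectral_centroid(frame, sample_rate)]


# Assumed test signal: a 1 kHz tone sampled at 8 kHz, 64 samples long.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE + 0.1) for t in range(64)]
zcr, centroid = fused_features(frame, SAMPLE_RATE)
```

For a pure tone the centroid sits at the tone's frequency, and the zero-crossing rate is roughly twice the frequency divided by the sample rate. In practice, per-frame values like these would be computed over a whole clip and stacked before being passed to a classifier.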
Author
CHEN Chang-feng (陈长风)
School of Computer, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China
Source
Communications Technology (《通信技术》)
2019, No. 5, pp. 1114-1118 (5 pages)