Abstract
To address the problem that traditional audio classification methods rely on manually constructed features, which makes the process cumbersome and the accuracy low, an audio classification method based on an improved convolutional neural network and random forest is proposed. First, long audio data is divided into segments. Second, a short-time Fourier transform is applied to each segment to obtain its spectrogram. Third, the spectrogram of each segment is fed into the convolutional neural network, which automatically extracts high-level audio features. Finally, the extracted high-level features are fed into a random forest to train a classifier. Experimental results show that the accuracy of the proposed method is 16.2% higher than that of the hidden Markov model (HMM) based method and 12% higher than that of the support vector machine (SVM) based method. The proposed algorithm effectively improves the accuracy of audio classification and extracts high-level audio features automatically, which reduces the complexity of feature construction.
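The first two steps described in the abstract (segmenting long audio and applying a short-time Fourier transform to each segment) can be sketched roughly as below. This is a minimal illustration using NumPy only, not the authors' implementation: the function names `split_into_segments` and `stft_magnitude`, the Hann window, the frame length of 256, the hop of 128, the 8 kHz sample rate, and the synthetic sine input are all assumptions chosen for the example. The abstract does not state any of these parameters.

```python
import numpy as np


def split_into_segments(signal, segment_len):
    """Cut a 1-D signal into non-overlapping segments, dropping any tail remainder."""
    n_segments = len(signal) // segment_len
    return signal[: n_segments * segment_len].reshape(n_segments, segment_len)


def stft_magnitude(segment, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed STFT.

    Returns an array of shape (frame_len // 2 + 1, n_frames).
    Window type, frame length, and hop are assumed values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(segment) - frame_len) // hop
    frames = np.stack(
        [segment[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies of the real-valued input.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Synthetic stand-in for a long recording: 2 s of a 1 kHz tone at 8 kHz (assumed).
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 1000 * t)

segments = split_into_segments(signal, segment_len=sample_rate)  # 1 s per segment
spectrograms = np.stack([stft_magnitude(s) for s in segments])
```

In the pipeline the abstract describes, each spectrogram would then be passed to the convolutional neural network for feature extraction, and the resulting feature vectors would be used to train a random forest classifier; those two stages are not shown here because the abstract gives no architectural details.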
Authors
付炜
杨洋
FU Wei; YANG Yang (College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China; Sichuan Institute of Computer Sciences, Chengdu, Sichuan 610041, China)
Source
《计算机应用》
CSCD
PKU Core Journal
2018, Issue A02, pp. 58-62 (5 pages)
Journal of Computer Applications
Keywords
audio classification
spectrogram
feature extraction
classifier
deep learning