
Musical Instrument Identification Based on Improved Convolutional Neural Network and Auditory Spectrum

Cited by: 3
Abstract: Traditional musical instrument identification relies on low-level acoustic features of the music, and its performance depends heavily on feature selection. To address this, the auditory spectrum, which is close to human auditory perception and has low redundancy, is used as the input of a 5-layer deep Convolutional Neural Network (CNN) that abstracts high-level time-frequency representations of timbre layer by layer for instrument identification. To capture the time-frequency information in the auditory spectrum effectively, the rectangular convolution kernel of the first layer is replaced with multi-scale kernels along the frequency and time axes. Simulation experiments on the IOWA musical instrument database show that the improved network achieves a recognition accuracy of 96.95%, outperforming the same network with a single-scale kernel. Under the same network structure, the recognition accuracy obtained with the auditory spectrum is 9.11% and 3.54% higher than that obtained with the Mel-Frequency Cepstral Coefficient (MFCC) and the spectrogram, respectively, and the misclassification rates for percussion instruments and for instruments of the same family are 2% and 3.1%, both lower than those of the MFCC and spectrogram features.
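The abstract specifies the architecture only at a high level: a 5-layer CNN whose first-layer rectangular kernel is replaced by multi-scale kernels along the frequency and time axes, applied to an auditory-spectrum input. The sketch below is a minimal, hypothetical PyTorch rendering of that idea; the branch kernel shapes, channel widths, input resolution, and number of classes are illustrative assumptions, not the configuration reported in the paper.

```python
# Hypothetical sketch: 5-layer CNN with a multi-scale first layer whose
# parallel kernels are elongated along the frequency and time axes of an
# auditory-spectrum input. All sizes are illustrative assumptions.
import torch
import torch.nn as nn


class MultiScaleInstrumentCNN(nn.Module):
    def __init__(self, n_classes=20):
        super().__init__()
        # First layer: parallel frequency-axis, time-axis, and square kernels,
        # whose feature maps are concatenated (the "multi-scale" improvement).
        self.branch_freq = nn.Conv2d(1, 16, kernel_size=(5, 1), padding=(2, 0))
        self.branch_time = nn.Conv2d(1, 16, kernel_size=(1, 5), padding=(0, 2))
        self.branch_square = nn.Conv2d(1, 16, kernel_size=(3, 3), padding=1)
        # Remaining layers abstract higher-level timbre representations.
        self.features = nn.Sequential(
            nn.BatchNorm2d(48), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(48, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, x):
        # x: (batch, 1, freq_bins, time_frames) auditory spectrum
        x = torch.cat(
            [self.branch_freq(x), self.branch_time(x), self.branch_square(x)],
            dim=1,
        )
        x = self.features(x)
        return self.classifier(x.flatten(1))


if __name__ == "__main__":
    model = MultiScaleInstrumentCNN()
    dummy = torch.randn(2, 1, 128, 128)  # e.g. 128 frequency channels x 128 frames
    print(model(dummy).shape)  # torch.Size([2, 20])
```

Concatenating the parallel frequency-axis, time-axis, and square-kernel feature maps is one common way to realize a multi-scale first layer; the paper may combine the branches differently.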
Authors: WANG Fei (王飞); YU Fengqin (于凤芹) (School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214100, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2019, No. 1, pp. 199-205 (7 pages)
Funding: National Natural Science Foundation of China (61703185)
Keywords: auditory spectrum; Convolutional Neural Network (CNN); convolution kernel; time-frequency feature; musical instrument identification
