Environmental Sound Classification Based on a Parallel Neural Network

Abstract: To address the low accuracy of traditional single-input models in environmental sound classification, a parallel feature-fusion neural network based on time-domain and frequency-domain features is proposed. In this network, the original audio is first processed with data augmentation. The processed raw audio data and the Mel spectrum feature data are then fed into a raw-waveform network and a Mel-spectrum network, respectively, and the resulting time-domain and spectral features are fused. Finally, the fused features are passed to a SoftMax classifier for classification. Experiments on the UrbanSound8K dataset yield a final classification accuracy of 96.03%, outperforming the other models.
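The fusion and classification steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the fusion operator, so concatenation is assumed here; the two branch outputs are random placeholders standing in for the raw-waveform network and the Mel-spectrum network; and the feature sizes (8 and 12), the linear layer, and all function names are hypothetical. Only the class count of 10 comes from the UrbanSound8K dataset itself.

```python
import math
import random

NUM_CLASSES = 10  # UrbanSound8K contains 10 sound classes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(time_features, mel_features):
    """Fuse the two branch outputs by concatenation (assumed operator)."""
    return list(time_features) + list(mel_features)


def classify(fused, weights, biases):
    """Apply one linear layer followed by softmax to the fused features."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def make_demo(seed=0):
    """Run the sketch on placeholder features; returns (fused, probabilities)."""
    rng = random.Random(seed)
    # Placeholder for the raw-waveform (time-domain) branch output.
    time_features = [rng.uniform(-1, 1) for _ in range(8)]
    # Placeholder for the Mel-spectrum (frequency-domain) branch output.
    mel_features = [rng.uniform(-1, 1) for _ in range(12)]
    fused = fuse(time_features, mel_features)
    # Untrained, randomly initialised classifier weights (hypothetical).
    weights = [[rng.gauss(0, 0.1) for _ in fused] for _ in range(NUM_CLASSES)]
    biases = [0.0] * NUM_CLASSES
    return fused, classify(fused, weights, biases)


if __name__ == "__main__":
    fused, probs = make_demo()
    print(len(fused), len(probs), round(sum(probs), 6))
```

In a trained model the placeholders would be replaced by the learned outputs of the two branches, and the classifier weights would be learned jointly with them.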

Authors: QIN Jingtao (覃镜涛); GAO Yuxiang (高瑜翔) (College of Communication Engineering, Chengdu University of Information Technology, Chengdu 610225, China; Key Laboratory of Meteorological Information and Signal Processing in Universities of Sichuan Province, Chengdu 610225, China)

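Affiliations: College of Communication Engineering, Chengdu University of Information Technology; Key Laboratory of Meteorological Information and Signal Processing in Universities of Sichuan Province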

Source: Transducer and Microsystem Technologies (《传感器与微系统》), CSCD, Peking University Core Journal, 2024, No. 7, pp. 106-109, 113 (5 pages in total)

Funding: University Innovation Team Project of the Sichuan Provincial Department of Education (15TD0022).

Keywords: parallel neural network; feature fusion; environmental sound classification

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]