Research and Application of an Environmental Sound Classification Algorithm Based on PANNs-CNN

Abstract: Environmental sound classification (ESC) depends mainly on two choices: how sound features are extracted and which classifier algorithm is used. To find the best combination of feature extraction method and classifier, this article studies and analyzes the deep learning model PANNs-CNN and experimentally compares different feature extraction methods. The results show that, compared with similar models, a pretrained and deeper CNN model improves ESC prediction performance, and that Log-Mel features better preserve the high-dimensional features of the sound signal and the correlations among them, which helps improve classification accuracy. The algorithm studied in the article, which combines Log-Mel feature extraction with PANNs-CNN14, achieves the best classification accuracy on the ESC-50 dataset in these experiments, and its effectiveness has been verified in a practical application.
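To make the Log-Mel feature extraction mentioned in the abstract concrete, the following is a minimal NumPy sketch and not the article's implementation. The function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `log_mel_spectrogram`) are hypothetical, and the parameter values (32 kHz sample rate, 1024-point FFT, hop of 320 samples, 64 Mel bands, HTK-style Mel formula) are illustrative assumptions, since the abstract does not state the settings used.

```python
import numpy as np


def hz_to_mel(f):
    """Convert frequency in Hz to the Mel scale (HTK-style formula, an assumption)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Build triangular Mel filters with shape (n_mels, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sr / 2.0, n_bins)
    # n_mels + 2 edge points, equally spaced on the Mel scale from 0 Hz to Nyquist.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        # Triangle: rises from lo to mid, falls from mid to hi, zero elsewhere.
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_spectrogram(signal, sr, n_fft=1024, hop=320, n_mels=64, eps=1e-10):
    """Return Log-Mel features in dB with shape (n_frames, n_mels)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_fft:
        # Zero-pad very short clips so that at least one frame exists.
        signal = np.pad(signal, (0, n_fft - len(signal)))
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum of each windowed frame: (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Pool FFT bins into Mel bands, then compress with a logarithm.
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel_energy + eps)


if __name__ == "__main__":
    sr = 32000
    t = np.arange(sr) / sr
    tone = np.sin(2.0 * np.pi * 440.0 * t)  # one-second synthetic test tone
    features = log_mel_spectrogram(tone, sr)
    print(features.shape)
```

The resulting two-dimensional time-by-Mel-band array is the kind of image-like input a CNN classifier can consume. MFCC features would additionally apply a discrete cosine transform across the Mel bands of each frame, which decorrelates them; the abstract's remark that Log-Mel features preserve feature correlations is consistent with omitting that step.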

Author: GUAN Zhiguang (关志广) (Nanning Vocational and Technical University, Nanning 530008, China)

Affiliation: Nanning Vocational and Technical University

Source: Wireless Internet Science and Technology (无线互联科技), 2024, No. 16, pp. 12-15 (4 pages)

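Funding: Guangxi Education Science "14th Five-Year Plan" 2023 Special Project, "Research on Practical Teaching Integrating Professional and Innovation-Entrepreneurship Education for Artificial Intelligence Majors in the Context of Emerging Engineering Education", project No. 2023ZJY1841.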

Keywords: environmental sound classification (ESC); pretrained audio neural networks (PANNs); convolutional neural network (CNN); Log-Mel; Mel frequency cepstrum coefficient

Classification code: TP3-05 [Automation and Computer Technology: Computer Science and Technology]

