Abstract
Environmental sound classification (ESC) mainly involves two choices: how sound features are extracted and which classifier is used. To find the best combination of feature extraction method and classifier, this article studies and analyzes the deep learning model PANNs-CNN and compares different feature extraction methods experimentally. The results show that, compared with similar models, a pretrained and deeper CNN improves ESC prediction performance. Log-Mel features better preserve the high-dimensional features of the sound signal and the correlations among them, which helps improve classification accuracy. The environmental sound classification algorithm studied in the article, based on Log-Mel feature extraction and PANNs-CNN14, achieved the best classification accuracy on the ESC-50 dataset, and its effectiveness was verified in a practical application.
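For readers unfamiliar with the Log-Mel features named in the abstract, the following is a minimal NumPy sketch of how such features are commonly computed: framing, Hann windowing, power spectrum, triangular mel filterbank, then a logarithm. It is not the article's implementation. The function names and the parameter values (32 kHz sample rate, 1024-point FFT, hop of 320 samples, 64 mel bands) are illustrative assumptions, not settings reported in the abstract.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style mel scale (one common convention; assumed here).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels, fmin=0.0, fmax=None):
    """Triangular filters with centers equally spaced on the mel scale."""
    fmax = sr / 2.0 if fmax is None else fmax
    hz_points = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_spectrogram(signal, sr=32000, n_fft=1024, hop=320, n_mels=64, eps=1e-10):
    """Return an array of shape (n_frames, n_mels) in decibels.

    All default values are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_fft:
        signal = np.pad(signal, (0, n_fft - len(signal)))
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    # eps avoids log(0) in silent bands.
    return 10.0 * np.log10(mel_energy + eps)


if __name__ == "__main__":
    sr = 32000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 1000.0 * t)  # synthetic 1 kHz test tone
    features = log_mel_spectrogram(tone, sr=sr)
    print(features.shape)
```

The resulting two-dimensional array (time frames by mel bands) is the kind of image-like input a CNN classifier typically consumes. In practice a library such as librosa would normally replace the hand-written filterbank.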
Author
GUAN Zhiguang (关志广), Nanning Vocational and Technical University, Nanning 530008, China
Source
Wireless Internet Science and Technology (《无线互联科技》)
2024, No. 16, pp. 12-15 (4 pages)
Funding
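Guangxi Education Science "14th Five-Year Plan" 2023 special project. Project title: Research on Practical Teaching that Integrates Professional Education with Innovation and Entrepreneurship Education for Artificial Intelligence Majors in the Context of Emerging Engineering Education. Project No. 2023ZJY1841.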