Acoustic scene classification based on multi-resolution time-frequency feature fusion (cited by: 3)
Abstract: Acoustic scene classification is one of the most difficult tasks in computer audition. With a single input feature, a basic convolutional neural network (CNN) already improves on traditional classification methods, but its accuracy remains unsatisfactory. To address this, the paper proposes an acoustic scene classification scheme based on time-frequency feature fusion within the CNN framework. For the model, a multi-resolution convolution and pooling scheme is used to build a multi-resolution CNN that better matches the time-frequency structure of the extracted features. For the features, a low-level envelope feature (log Mel-band energies) and a high-level structural feature (the non-negative matrix factorization coefficient matrix) are fused: the two two-dimensional features are stacked into a three-dimensional feature and fed to the classification model. Training and testing were carried out on the development datasets of the 2017 and 2018 Detection and Classification of Acoustic Scenes and Events (DCASE) challenges. The experimental results show that the proposed scheme improves classification accuracy over the baseline system by 7.5% and 10.3%, respectively.
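The feature fusion step in the abstract (stacking log Mel-band energies and an NMF coefficient matrix into one three-dimensional input) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not say which representation is factorized, how many NMF components are used, or how the factorization is computed. Here NMF is applied to the Mel-band energy matrix itself, the rank is set equal to the number of Mel bands so both maps have the same shape, and standard multiplicative updates for the Frobenius objective are used. The function names `nmf_activations` and `fuse_features` are hypothetical, and the input is synthetic data standing in for real Mel-band energies.

```python
import numpy as np


def nmf_activations(V, rank, n_iter=100, eps=1e-9, seed=0):
    """Approximate non-negative V (bands x frames) as W @ H and return H.

    Uses multiplicative updates for the Frobenius-norm objective.
    Assumption: the paper's NMF settings are not given in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_frames = V.shape
    W = rng.random((n_bands, rank)) + eps   # dictionary (basis spectra)
    H = rng.random((rank, n_frames)) + eps  # coefficient (activation) matrix
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return H


def fuse_features(mel_energy, eps=1e-9):
    """Stack log Mel-band energies and NMF coefficients as two channels."""
    log_mel = np.log(mel_energy + eps)  # low-level envelope feature
    # Assumption: rank == number of Mel bands, so both maps share one shape.
    coeffs = nmf_activations(mel_energy, rank=mel_energy.shape[0])
    return np.stack([log_mel, coeffs], axis=-1)  # bands x frames x 2


# Synthetic stand-in for a (40 bands x 100 frames) Mel-band energy matrix.
rng = np.random.default_rng(42)
mel_energy = rng.random((40, 100))
features = fuse_features(mel_energy)
print(features.shape)  # (40, 100, 2)
```

The resulting array has the layout a two-channel image would have, which is the kind of input a two-dimensional CNN accepts. In practice the two channels would likely need separate normalization, since log energies and NMF coefficients have different ranges.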
Authors: YAO Kun (姚琨); YANG Jibin (杨吉斌); ZHANG Xiongwei (张雄伟); ZHENG Changyan (郑昌艳); SUN Meng (孙蒙) (Army Engineering University, Nanjing 210007, Jiangsu, China)
Affiliation: Army Engineering University (陆军工程大学)
Source: Technical Acoustics (《声学技术》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 4, pp. 494-500 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (61471394) and the Jiangsu Province Outstanding Youth Fund (BK20180080).
Keywords: acoustic scene classification; multi-resolution convolutional neural network; time-frequency feature fusion; time-frequency structure; non-negative matrix factorization