
D-S sound classification based on double two-stream convolution and multi-feature fusion
Abstract: To address the insufficient accuracy of existing sound classification models, this paper proposes a Dempster-Shafer (D-S) fusion model based on a multi-feature double two-stream network. First, four combined features are proposed to represent sound more comprehensively and effectively. Second, a double two-stream network structure is proposed to train the model better. The first and second streams feed multi-resolution, multi-channel features into a second-order dense convolutional network (2-DenseNet), which is divided into two dense blocks. The third and fourth streams feed concatenated single-resolution, single-channel features into a four-layer CNN. D-S evidence theory is then used to fuse the outputs of the softmax layers, yielding the D-S-Net model. Experiments on the UrbanSound8K dataset show that, after data augmentation, the model reaches an accuracy of 96.36%, which is 25.34% higher than the baseline. Robustness in noisy environments is also verified: the model achieves a recognition rate of 90.34% at a signal-to-noise ratio (SNR) of 20 dB, and its performance at low SNR is substantially improved.
Authors: Wu Jiasai; Gao Zhenbin (School of Electronic Information Engineering, Hebei University of Technology, Tianjin 300401, China)
Source: Application Research of Computers (计算机应用研究), CSCD, Peking University Core Journal, 2022, No. 3, pp. 693-698, 703 (7 pages)
Keywords: sound classification; feature fusion; dense convolution network; D-S fusion; double two-stream network
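
The D-S fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the softmax outputs are turned into mass functions, so this sketch assumes the simplest case: each stream's softmax vector is treated as a mass function that puts mass only on single classes (no mass on compound sets). Under that assumption Dempster's rule reduces to a normalized element-wise product. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
from functools import reduce


def dempster_combine(m1, m2):
    """Combine two mass vectors over the same singleton classes.

    Assumption: all mass sits on single classes, so two different
    classes never intersect and only matching classes contribute.
    """
    if len(m1) != len(m2):
        raise ValueError("mass vectors must have the same length")
    # Joint mass where both sources support the same class.
    joint = [a * b for a, b in zip(m1, m2)]
    # agreement = 1 - K, where K is the conflict between the sources.
    agreement = sum(joint)
    if agreement == 0.0:
        raise ValueError("total conflict: sources cannot be combined")
    # Dempster's rule renormalizes by the non-conflicting mass.
    return [j / agreement for j in joint]


def fuse_streams(stream_outputs):
    """Fuse the softmax outputs of several streams pairwise."""
    return reduce(dempster_combine, stream_outputs)


# Made-up softmax outputs of four streams over three classes.
streams = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
]
fused = fuse_streams(streams)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In this example three of the four streams favor the first class, and the fused vector concentrates more mass on that class than any single stream does, which is the intended effect of combining agreeing evidence.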
