Discrimination Method of Terrorism Audio Based on Transfer Learning
Abstract: Terrorism audio clips were collected from the internet and from films to build a terrorism audio library. Because sources of such audio are limited, while convolutional neural networks need large amounts of training data, transfer learning is introduced into terrorism audio discrimination. The network is first pretrained on the public TUT acoustic scenes dataset. The model weights are then retained and the network is transferred to the terrorism audio library for further training. Finally, layers are added to the fine-tuned network in a structure similar to a residual network, so that it can exploit more audio information. Experimental results show that the average discrimination rate with transfer learning is 3.97% higher than without it, which effectively addresses the training problem caused by the small audio dataset in terrorism audio discrimination research. The improved transfer learning network raises the average discrimination rate by a further 1.01%, reaching a final discrimination rate of 96.97%.
Authors: HU Xin-Xu (胡鑫旭); ZHOU Xin (周欣); HE Xiao-Hai (何小海); XIONG Shu-Hua (熊淑华); WANG Zheng-Yong (王正勇). College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
Source: Computer Systems & Applications (《计算机系统应用》), 2019, No. 11, pp. 147-152 (6 pages)
Funding: National Natural Science Foundation of China (61871278); Chengdu Industrial Cluster Collaborative Innovation Project (2016-XT00-00015-GX); Sichuan Science and Technology Program (2018HH0143); Scientific Research Project of the Sichuan Provincial Department of Education (18ZB0355)
Keywords: discrimination of terrorism audio; transfer learning; convolutional neural network; deep learning; residual network
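
The three steps named in the abstract (pretrain on a source task, keep the weights and transfer the network, then add a residual-style block) can be illustrated with a toy sketch. This is not the authors' network: small dense layers stand in for the convolutional layers, no training loop is shown, and every name and size here (`make_layer`, `transfer`, 15 source classes, 2 target classes) is a hypothetical choice made only for illustration.

```python
import copy
import random


def make_layer(n_in, n_out, rng):
    """Create a dense layer (weights and biases) with small random weights."""
    return {
        "w": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def linear(layer, x):
    """Affine map: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]


def dense(layer, x):
    """Affine map followed by ReLU."""
    return [max(0.0, v) for v in linear(layer, x)]


def residual_block(layer, x):
    """Residual-style block: add the input back onto the layer output, y = x + F(x)."""
    return [xi + fi for xi, fi in zip(x, dense(layer, x))]


def transfer(pretrained, n_target_classes, rng):
    """Keep the pretrained feature layers, append a new residual-style layer,
    and replace the classifier head with one sized for the target task."""
    features = copy.deepcopy(pretrained["features"])  # retained weights
    width = len(features[-1]["b"])
    return {
        "features": features,
        "extra": make_layer(width, width, rng),             # newly added layer
        "head": make_layer(width, n_target_classes, rng),   # new classifier head
    }


def forward(model, x):
    """Run the feature layers, the optional residual block, then the head."""
    for layer in model["features"]:
        x = dense(layer, x)
    if "extra" in model:
        x = residual_block(model["extra"], x)
    return linear(model["head"], x)  # raw class scores


rng = random.Random(0)
# Stand-in for a network already trained on the source task.
# 15 source classes is an assumed value, not taken from the paper.
pretrained = {
    "features": [make_layer(8, 16, rng), make_layer(16, 16, rng)],
    "head": make_layer(16, 15, rng),
}
# 2 target classes (terrorism vs. other audio) is also an assumption.
model = transfer(pretrained, n_target_classes=2, rng=rng)
scores = forward(model, [0.5] * 8)
```

In a real setting the transferred model would then be trained further on the target data. The skip connection means the added layer starts close to an identity map, so the new block need not disturb the transferred features at the start of fine-tuning.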