Abstract
This paper builds a terrorism audio dataset from clips extracted from the internet and from films. Because sources of such audio are limited, while convolutional neural networks need large amounts of training data, transfer learning is introduced into terrorism audio discrimination. The network is first pretrained on the public TUT acoustic scenes dataset; the model weights are then retained and the network is transferred to the terrorism audio dataset for further training. Finally, layers are added to the fine-tuned network, using a structure similar to a residual network so that it can exploit more of the audio information. Experimental results show that the average discrimination rate with transfer learning is 3.97% higher than without it, which effectively addresses the training problem caused by the small dataset in terrorism audio discrimination research. The improved transfer learning network raises the average discrimination rate by a further 1.01%, reaching a final discrimination rate of 96.97%.
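The three steps in the abstract (keep the pretrained weights, retrain on the small target set, add a residual-style layer) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: small dense layers stand in for convolutional layers, the layer names, sizes, and class counts are hypothetical, and the gradient-based training steps are omitted.

```python
import random

def init_layer(n_in, n_out, rng):
    """Small random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def transfer_weights(pretrained, target_shapes, rng, head="classifier"):
    """Copy every pretrained layer except the classifier head.

    The head is re-initialised because the source and target tasks
    have different classes; layers absent from the source are new.
    """
    model = {}
    for name, (n_in, n_out) in target_shapes.items():
        if name != head and name in pretrained:
            model[name] = [row[:] for row in pretrained[name]]  # retained weights
        else:
            model[name] = init_layer(n_in, n_out, rng)  # to be trained on target data
    return model

def linear_relu(weights, x):
    """Dense layer followed by ReLU (stand-in for a convolutional layer)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def residual_block(weights, x):
    """y = x + f(x): the skip connection passes the input through unchanged."""
    fx = linear_relu(weights, x)
    return [a + b for a, b in zip(x, fx)]

rng = random.Random(0)

# Hypothetical source model, as if pretrained on a 15-class scene task.
source = {"features": init_layer(8, 4, rng), "classifier": init_layer(4, 15, rng)}

# Hypothetical target model: same feature layer, one added residual-style
# layer ("extra"), and a new two-class head.
target_shapes = {"features": (8, 4), "extra": (4, 4), "classifier": (4, 2)}
model = transfer_weights(source, target_shapes, rng)

# Forward pass through the transferred network.
x = [0.5] * 8
h = linear_relu(model["features"], x)
h = residual_block(model["extra"], h)
scores = [sum(w * v for w, v in zip(row, h)) for row in model["classifier"]]
```

In this sketch only `features` carries over from the source model; `extra` and `classifier` start from fresh random weights and would be learned during fine-tuning. The skip connection means the added layer can fall back to passing the transferred features through unchanged, which is one common motivation for residual-style additions.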
Authors
HU Xin-Xu (胡鑫旭)
ZHOU Xin (周欣)
HE Xiao-Hai (何小海)
XIONG Shu-Hua (熊淑华)
WANG Zheng-Yong (王正勇)
College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
Source
Computer Systems & Applications (《计算机系统应用》)
2019, No. 11, pp. 147-152 (6 pages)
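Funding
National Natural Science Foundation of China (61871278)
Chengdu Industrial Cluster Collaborative Innovation Project (2016-XT00-00015-GX)
Sichuan Science and Technology Program (2018HH0143)
Scientific Research Project of the Sichuan Provincial Department of Education (18ZB0355)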
Keywords
discrimination of terrorism audio
transfer learning
convolutional neural network
deep learning
residual network