Weakly Labeled Semi-supervised Sound Event Detection Based on Mean Teacher Model
Abstract To exploit large amounts of unbalanced and unlabeled data, a mean teacher model based on consistency regularization is applied to weakly labeled semi-supervised sound event detection, which effectively reduces overfitting in semi-supervised learning. In updating the teacher model's weights, the stochastic weight averaging (SWA) algorithm is applied to sound event detection for the first time, which speeds up prediction and saves cost. For the model architecture, an improved gated convolutional long short-term memory (GCLSTM) network serves as both the student model and the teacher model; its global weighted rank pooling (GWRP) layer overcomes the tendency of average pooling and max pooling to underestimate or overestimate sound events, effectively improving system performance. During feature extraction, the SpecAugment strategy is used to augment the spectrograms, which further mitigates overfitting. The method is evaluated on the dataset of Task 4 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 challenge. The results show that the average F1 score on the evaluation set reaches 24.9%, clearly better than the F1 scores of the baseline system and other methods.
Authors WANG Jinjia; YANG Qian; CUI Lin; JI Shaonan (School of Information Science and Engineering (School of Software), Yanshan University, Qinhuangdao, Hebei 066004, China; Hebei Key Laboratory of Information Transmission and Signal Processing, Yanshan University, Qinhuangdao, Hebei 066004, China)
Source Journal of Fudan University: Natural Science (indexed in CAS, CSCD, Peking University Core Journals), 2020, No. 5, pp. 540-550, 11 pages
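Funding National Natural Science Foundation of China (61473339); Hebei Province Young Top Talent Support Program ([2013]17); Beijing-Tianjin-Hebei Basic Research Cooperation Special Project (F2019203583).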
Keywords sound event detection; weakly labeled semi-supervised; mean teacher; stochastic weight averaging; data augmentation
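
The abstract names several building blocks without giving formulas. The following is a minimal sketch of three of them in plain Python, not the authors' implementation: global weighted rank pooling over frame-level probabilities, the exponential-moving-average teacher update used in the standard mean teacher model, and the equal-weight running average used by SWA. The function names (gwrp, ema_update, swa_update, consistency_loss) and the default values of decay and alpha are assumptions for illustration; the abstract does not state the hyperparameters or how SWA is combined with the teacher update.

```python
def gwrp(frame_probs, decay=0.5):
    """Global weighted rank pooling for one class.

    Frame probabilities are sorted in descending order and weighted by
    decay**rank. decay=0 reduces to max pooling, decay=1 to average pooling.
    The default decay is an illustrative assumption.
    """
    ranked = sorted(frame_probs, reverse=True)
    weights = [decay ** i for i in range(len(ranked))]
    return sum(w * p for w, p in zip(weights, ranked)) / sum(weights)


def ema_update(teacher, student, alpha=0.999):
    """Standard mean-teacher step: teacher = alpha*teacher + (1-alpha)*student."""
    return {k: alpha * teacher[k] + (1 - alpha) * student[k] for k in teacher}


def swa_update(averaged, new_weights, n_averaged):
    """SWA step: equal-weight running mean over the weights collected so far."""
    return {
        k: (averaged[k] * n_averaged + new_weights[k]) / (n_averaged + 1)
        for k in averaged
    }


def consistency_loss(student_probs, teacher_probs):
    """Mean squared error between student and teacher predictions."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


if __name__ == "__main__":
    probs = [0.9, 0.1, 0.2, 0.8]          # hypothetical frame-level outputs
    print(gwrp(probs, decay=0.0))          # behaves like max pooling
    print(gwrp(probs, decay=1.0))          # behaves like average pooling
    print(gwrp(probs, decay=0.5))          # lies between the two extremes
    teacher = ema_update({"w": 1.0}, {"w": 0.0}, alpha=0.9)
    print(teacher, swa_update({"w": 1.0}, {"w": 3.0}, n_averaged=1))
```

Intermediate values of decay let the clip-level score depend on the strongest few frames rather than on a single frame or on all frames equally, which is the property the abstract attributes to the GWRP layer.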