Weakly supervised action localization method based on attention mechanism

Cited by: 1
Abstract: To address the problem that weakly supervised action localization methods cannot localize actions directly and have limited localization accuracy, a weakly supervised action localization method based on the attention mechanism is proposed, and an action localization model based on the information of the frames before and after the action and on a distinguishing function is designed and implemented. First, a Conditional Variational AutoEncoder (CVAE) attention value generation model is used, and the frame-level attention values it generates serve as pseudo frame-level labels. Second, to strengthen the correlation between neighboring frames, the CVAE attention value generation model is improved by adding the information of the frames before and after the action when obtaining the frame-level attention values. Third, an attention value optimization model based on the distinguishing function is used to repeatedly train and optimize the pseudo frame-level labels. Experimental results on the THUMOS14 and ActivityNet1.2 datasets show that the action localization model based on the preceding and following frame information and the distinguishing function achieves good localization performance and accuracy. Compared with the model without the preceding and following frame information, the missed detection rate is reduced by 11.7%. Compared with AutoLoc, the Weakly-supervised Temporal Activity Localization and Classification framework (W-TALC), 3C-Net and other weakly supervised action localization models, when the Intersection over Union (IoU) threshold is set to 0.5, the mean Average Precision (mAP) improves by more than 10.7% on the THUMOS14 dataset and by more than 8.8% on the ActivityNet1.2 dataset.
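The abstract reports results at an IoU threshold of 0.5 and treats frame-level attention values as pseudo frame-level labels. The sketch below is illustrative only and is not the authors' implementation: it shows how temporal IoU between a predicted and a ground-truth segment is commonly computed, and one simple way (an assumption here) to turn per-frame attention values into candidate segments by thresholding. The names `temporal_iou` and `segments_from_attention`, and the 0.5 attention threshold, are hypothetical choices made for this example.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end), in frame indices or seconds


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over Union of two 1-D temporal segments."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def segments_from_attention(attention: List[float],
                            threshold: float = 0.5) -> List[Segment]:
    """Group consecutive frames with attention >= threshold into
    half-open [start, end) segments. The thresholding rule is an
    assumption for illustration, not a step stated in the abstract."""
    segments: List[Segment] = []
    start = None
    for i, value in enumerate(attention):
        if value >= threshold and start is None:
            start = i  # a new candidate action segment begins
        elif value < threshold and start is not None:
            segments.append((float(start), float(i)))
            start = None
    if start is not None:  # segment runs to the last frame
        segments.append((float(start), float(len(attention))))
    return segments


if __name__ == "__main__":
    attention = [0.1, 0.8, 0.9, 0.7, 0.2, 0.6, 0.9, 0.1]
    for seg in segments_from_attention(attention):
        # A prediction counts as correct at IoU 0.5 if its overlap
        # with a ground-truth segment reaches that threshold.
        print(seg, temporal_iou(seg, (2.0, 5.0)) >= 0.5)
```

Under this convention, a predicted segment is a true positive at IoU 0.5 only if it overlaps a ground-truth action of the same class by at least half of their combined extent; mAP then averages the per-class average precision computed from such matches.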
Authors: HU Cong (胡聪); HUA Gang (华钢) (College of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 3, pp. 960-967 (8 pages)
Keywords: weakly supervised; attention value; Conditional Variational AutoEncoder (CVAE); distinguishing function; action localization; mean Average Precision (mAP)