An Action Recognition Model Based on an Action-Guided Masking Strategy
Abstract Video-level action analysis usually processes the entire input video sequence, which introduces substantial temporal and spatial redundancy. To reduce this redundancy and improve the accuracy of video action recognition, this paper proposes an action-guided masked action recognition model that uses a high masking ratio to keep reconstruction difficult and to reduce Graphics Processing Unit (GPU) computation. An adaptive mask sampler is proposed, in which an auxiliary network dynamically samples the visible tokens based on the previously computed reconstruction-branch loss, which improves the robustness of the model. The authors observe that, even after training is complete, the learned features are still not fully exploited for classification; to address this, an auxiliary classifier is proposed. In experiments on the Kinetics-400 and Something-Something v2 datasets with a ViT-Base backbone, the model achieves accuracies of 81.9% and 71.7%, respectively.
Authors XIE Huizhi; WANG Xiangyang; PEI Tao (College of Communication and Information Engineering, Shanghai University, Shanghai 200444, China)
Source Video Engineering (《电视技术》), 2024, No. 8, pp. 35-38, 45 (5 pages)
Keywords self-supervised learning; mask strategy; action recognition
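
The abstract describes sampling a small set of visible tokens under a high masking ratio, guided by earlier reconstruction losses. The sketch below illustrates one way such a sampler could look; it is not the authors' implementation. The function name `sample_visible_tokens`, the `token_scores` input (a hypothetical stand-in for whatever the auxiliary network predicts), and the masking ratio of 0.9 are all assumptions made for illustration, since the abstract gives no such details.

```python
import numpy as np


def sample_visible_tokens(token_scores, mask_ratio=0.9, rng=None):
    """Choose which tokens of one clip stay visible (hypothetical sketch).

    token_scores: 1-D array of non-negative per-token scores, assumed here
        to come from an auxiliary network informed by past reconstruction
        losses. Higher score means more likely to be kept visible.
    mask_ratio: fraction of tokens to hide (0.9 is an illustrative value).
    Returns (sorted visible indices, boolean mask where True = masked).
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(token_scores, dtype=np.float64)
    num_tokens = scores.shape[0]

    # Keep at least one token so the encoder always has some input.
    num_visible = max(1, int(round(num_tokens * (1.0 - mask_ratio))))

    # Turn scores into a probability distribution; the small epsilon keeps
    # every token selectable so sampling without replacement cannot fail.
    weights = scores + 1e-8
    probs = weights / weights.sum()

    visible = rng.choice(num_tokens, size=num_visible, replace=False, p=probs)

    mask = np.ones(num_tokens, dtype=bool)
    mask[visible] = False
    return np.sort(visible), mask
```

With an assumed 1568 tokens per clip and a masking ratio of 0.9, the encoder would see 157 tokens, which is where the reduction in GPU computation mentioned in the abstract would come from.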