Action recognition based on spatial-temporal attention pyramid convolution
Abstract: Action recognition algorithms must extract both spatial and temporal features from video, which demands substantial computing and storage resources. Networks based on 2D CNNs are more lightweight, but they are weaker at extracting temporal features from video, which usually limits their action recognition performance. S-TPNet introduced a spatial-temporal pyramid module to capture temporal-granularity features of image sequences, effectively improving the performance of 2D CNN-based action recognition networks. Building on S-TPNet, this paper designs a spatial-temporal attention model to highlight the important spatial and temporal features. To reduce the amount of input data, only a subset of video frames is usually sampled as input; to reduce the unstable discrepancy between the sampled frames and the full video, this paper designs an adaptive equal-interval sampling strategy. Experiments show that, without pre-training, the network improves Top-1 accuracy by 5.1% and 3.3% on the UCF-101 and HMDB-51 datasets, respectively, without substantially increasing the number of parameters.
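The abstract does not spell out the sampling rule, so the following is only a minimal sketch of what equal-interval sampling with a length-adaptive interval could look like. The function name `equal_interval_indices`, its parameters, and the choice of taking the centre frame of each segment are assumptions made for illustration, not the authors' implementation.

```python
def equal_interval_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick num_samples frame indices spread evenly over a video.

    Hypothetical helper: the interval is derived from the video length,
    so short and long videos are both covered from start to end.
    """
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")

    # Interval adapts to the video length (may be fractional).
    interval = total_frames / num_samples

    # Take the centre of each of the num_samples equal-length segments.
    indices = [int(interval * (i + 0.5)) for i in range(num_samples)]

    # Clamp to the last valid frame index as a safety net.
    return [min(idx, total_frames - 1) for idx in indices]


if __name__ == "__main__":
    print(equal_interval_indices(80, 8))
    print(equal_interval_indices(4, 8))
```

When the video has fewer frames than the requested sample count, this sketch simply repeats frames so that the output length stays fixed, which keeps the input shape of a 2D CNN backbone constant.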
Authors: Feng Yuwei; Wu Lijun (College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China)
Source: Cyber Security and Data Governance (《网络安全与数据治理》), 2023, No. 2, pp. 76-82, 88 (8 pages)
Keywords: spatial-temporal attention; action recognition; adaptive sampling; 2D CNN; spatial-temporal pyramid