
Dense Feature Pyramid Backbone Network in Video Action Location (Cited by: 1)
Abstract: In video action localization algorithms, temporal resolution drops as the number of pyramid levels grows, leaving detail features incomplete and degrading prediction accuracy. To address this problem, a densely connected feature pyramid backbone network is proposed. Video frames are fed into the feature pyramid backbone, where the densely connected pyramid extracts frame-level and level-wise features, linking the reference-layer and base-layer features with the deep features during feature extraction. The frame-level and level-wise features are then passed to the prediction stage, which fuses optical flow information and outputs the action start/end times and label predictions. On the THUMOS14 dataset, the mean average precision (mAP) is 0.4% higher than that of AFSD; the method accurately localizes the start and end times and categories of actions in video and can be applied to scenarios such as intelligent surveillance.
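The abstract describes the densely connected pyramid only at a high level. As a rough illustration of the idea, the sketch below builds a 1D (temporal) feature pyramid in PyTorch in which each level concatenates pooled copies of all shallower levels with its downsampled features before fusing them; the class name, channel width, number of levels, and the pooling-based skip connections are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch of a densely connected temporal feature pyramid in the spirit of
# the abstract. Module names, channel widths, level count, and the pooling-based
# dense connections are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DensePyramid1D(nn.Module):
    """Temporal feature pyramid where every level receives dense connections
    from all shallower levels, so deep, low-resolution levels keep access to
    frame-level detail."""

    def __init__(self, channels: int = 256, num_levels: int = 5):
        super().__init__()
        self.num_levels = num_levels
        # Stride-2 temporal convolutions that halve the resolution at each level.
        self.down = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=3, stride=2, padding=1)
            for _ in range(num_levels - 1)
        )
        # 1x1 convolutions that fuse the concatenated dense inputs back to `channels`.
        self.fuse = nn.ModuleList(
            nn.Conv1d(channels * (i + 2), channels, kernel_size=1)
            for i in range(num_levels - 1)
        )

    def forward(self, frame_feats: torch.Tensor) -> list:
        # frame_feats: (batch, channels, num_frames), e.g. from an RGB/optical-flow encoder.
        levels = [frame_feats]
        for i in range(self.num_levels - 1):
            x = F.relu(self.down[i](levels[-1]))
            # Dense connection: resize every earlier level to the current length
            # and concatenate it with the downsampled features before fusing.
            t = x.shape[-1]
            skips = [F.adaptive_max_pool1d(lv, t) for lv in levels]
            x = F.relu(self.fuse[i](torch.cat(skips + [x], dim=1)))
            levels.append(x)
        return levels  # one feature map per level for the prediction heads


# Example: 512 frames of 256-d features yield 5 levels (lengths 512, 256, 128, 64, 32).
pyramid = DensePyramid1D()
feats = pyramid(torch.randn(2, 256, 512))
```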
Authors: TONG Ming-wei (佟明蔚); MAO Lin (毛琳); YANG Da-wei (杨大伟) (School of Electromechanical Engineering, Dalian Minzu University, Dalian, Liaoning 116605, China)
Source: Journal of Dalian Minzu University (《大连民族大学学报》), 2022, No. 5, pp. 412-417 (6 pages)
Funding: National Natural Science Foundation of China (61673084); Natural Science Foundation of Liaoning Province (20170540192, 20180550866, 2020-MZLH-24)
Keywords: temporal action localization; dense connection; feature pyramid; feature fusion


