
基于深度学习的视频人体动作识别综述 (Cited by: 3)

A survey of video human action recognition based on deep learning
Abstract: With the rapid development of network multimedia technology and the continuous improvement of video capture equipment, more and more videos are shared on network platforms, and video has become an integral part of daily life. Video understanding has therefore become one of the hot topics in computer vision research. As the primary task of video understanding, action recognition is of great research significance. Deep learning methods for 2D image recognition and classification have made considerable progress, but video action recognition still faces major challenges. The reason is that video differs from a 2D image by an additional temporal dimension: understanding actions such as walking, running, high jumping, and long jumping requires not only the spatial semantic information that 2D images provide but also temporal information. How to exploit the temporal information in video is therefore critical for action recognition. This paper first introduces the research background and development of action recognition and analyzes the challenges that video action recognition currently faces. It then describes methods for temporal modeling and parameter optimization in detail and reviews the commonly used action recognition datasets and evaluation metrics. Finally, it outlines future research directions.
Authors: BI Chun-yan (毕春艳); LIU Yue (刘越) (Beijing Mixed Reality and New Display Engineering Technology Research Center, Beijing 100081, China; School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China)
Source: Journal of Graphics (《图学学报》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 4, pp. 625-639 (15 pages)
Funding: National Natural Science Foundation of China (61960206007); Programme of Introducing Talents of Discipline to Universities (B18005).
Keywords: action recognition; video understanding; deep learning; convolutional neural network; computer vision
