Abstract
Driven by the needs of human-computer interaction (HCI) and virtual reality (VR) systems, research on the theory and methods of 3D gesture tracking has attracted wide attention in China and abroad. In recent years, computer-vision-based 3D gesture tracking algorithms have developed rapidly. Among the available sensors, the low-cost and ubiquitous monocular RGB camera shows the greatest potential: it is an important means of bringing 3D gesture tracking applications into practice and has become a focus of research. To survey the state of gesture tracking algorithms built on this sensor and to support deeper exploration by researchers in the field, this paper first introduces 3D gesture tracking algorithms based on monocular RGB images by contrasting them with traditional methods, divides them into three categories (discriminative, generative, and hybrid methods), and summarizes the advantages and disadvantages of each. Second, it discusses how the characteristics of RGB images affect 3D gesture tracking and summarizes methods for alleviating the depth ambiguity of such images. Third, following this classification, it analyzes representative algorithms that take RGB data as input and compares their specific strengths and weaknesses using visualized performance evaluation metrics. Finally, it summarizes the open problems facing current 3D gesture tracking algorithms and discusses prospects for future development.
Authors
ZHANG Ji-kai (张继凯)
LI Qi (李琦)
WANG Yue-ming (王月明)
LYU Xiao-qi (吕晓琪)
Affiliations: School of Information Engineering, Inner Mongolia University of Science & Technology, Baotou, Inner Mongolia 014010, China; Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, Baotou, Inner Mongolia 014010, China; School of Information Engineering, Inner Mongolia University of Technology, Hohhot 010051, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2022, No. 4, pp. 174-187 (14 pages)
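Funding
National Natural Science Foundation of China (61771266)
Natural Science Foundation of Inner Mongolia Autonomous Region (2019BS06005)
Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (NJZY20095)
Science and Technology Plan Project of Inner Mongolia Autonomous Region (2019GG138)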
Keywords
Human-computer interaction
Virtual reality
RGB image
3D gesture tracking
Computer vision