Abstract
Labeled data for hand pose estimation is difficult to obtain in the desired quantity, realism, and accuracy. To reduce the need for such data, this paper proposes a semi-supervised learning method based on multi-view projection. First, the hand region is segmented from a single unlabeled depth image, and its 3D points are projected onto three orthogonal planes. Second, an encoder-decoder model learns a joint representation of two projected views in a low-dimensional latent space. Finally, a small amount of labeled data is used to learn a regression mapping from the latent representation to the 3D coordinates of the hand joints. Experiments on the NYU hand pose estimation dataset show that the method reduces the dependence on labeled data while achieving good results.
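The first step described in the abstract, projecting the hand's 3D points onto three orthogonal planes, can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function name `project_to_orthogonal_planes`, the `resolution` parameter, and the choice of binary occupancy maps as the pixel encoding are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def project_to_orthogonal_planes(points, resolution=64):
    """Project an (N, 3) point cloud onto the x-y, y-z, and z-x planes.

    Returns a dict of three (resolution, resolution) binary occupancy maps.
    The occupancy encoding is an assumption made for illustration.
    """
    points = np.asarray(points, dtype=np.float64)

    # Normalize into the unit cube with one shared scale so that the
    # hand shape is not stretched differently along each axis.
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    if extent == 0:
        extent = 1.0
    unit = (points - mins) / extent

    # Convert normalized coordinates to pixel indices in [0, resolution - 1].
    idx = np.minimum((unit * resolution).astype(int), resolution - 1)

    # Each view keeps two axes (column axis u, row axis v) and drops the third.
    axes = {"xy": (0, 1), "yz": (1, 2), "zx": (2, 0)}
    views = {}
    for name, (u, v) in axes.items():
        image = np.zeros((resolution, resolution))
        image[idx[:, v], idx[:, u]] = 1.0  # mark every pixel hit by a point
        views[name] = image
    return views


if __name__ == "__main__":
    # Hypothetical input: random points standing in for a segmented hand.
    cloud = np.random.default_rng(0).random((500, 3))
    for name, image in project_to_orthogonal_planes(cloud, 32).items():
        print(name, image.shape, int(image.sum()))
```

In a pipeline like the one the abstract outlines, maps of this kind would serve as the input and reconstruction target of the encoder-decoder model; a depth-valued encoding could be substituted for the binary maps without changing the overall structure.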
Authors
况逸群
程洪
崔芳
KUANG Yi-qun; CHENG Hong; CUI Fang (Center for Robotics, University of Electronic Science and Technology of China, Chengdu 611731)
Source
《电子科技大学学报》
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2019, No. 5, pp. 747-753 (7 pages)
Journal of University of Electronic Science and Technology of China
Keywords
depth image
hand pose estimation
multi-view
semi-supervised learning