Virtual reality, augmented reality, robotics, and autonomous driving have recently attracted much attention from both the academic and industrial communities, and image-based camera localization is a key task in all of them. However, there has been no complete review of image-based camera localization, and it is urgent to map this topic so that individuals can enter the field quickly. In this paper, an overview of image-based camera localization is presented. A new and complete classification of image-based camera localization approaches is provided, and the related techniques are introduced. Trends for future development are also discussed. This will be useful not only to researchers, but also to engineers and other individuals interested in this field.
Background: Generally, it is difficult to obtain accurate pose and depth for a non-rigid moving object from a single RGB camera in order to create augmented reality (AR). In this study, we build an AR system from a single RGB camera for a non-rigid moving human by accurately computing pose and depth, for which the two key tasks are segmentation and monocular Simultaneous Localization and Mapping (SLAM). Most existing monocular SLAM systems are designed for static scenes, whereas in this AR system the human body is always moving and non-rigid.

Methods: To make the SLAM system suitable for a moving human, we first segment the rigid parts of the human in each frame. A segmented moving body part can be regarded as a static object, and the relative motion between each moving body part and the camera can be considered the motion of the camera; typical SLAM systems designed for static scenes can then be applied. In the segmentation step of this AR system, we first employ the proposed BowtieNet, which adds the atrous spatial pyramid pooling (ASPP) of DeepLab between the encoder and decoder of SegNet, to segment the human in the original frame; we then use color information to extract the face from the segmented human area.

Results: Based on the human segmentation results and a monocular SLAM, this system can change the video background and add a virtual object to the human.

Conclusions: Experiments on human image segmentation datasets show that BowtieNet achieves state-of-the-art human image segmentation performance with sufficient speed for real-time segmentation. Experiments on videos show that the proposed AR system can robustly add a virtual object to a human and accurately change the video background.
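The core idea behind ASPP, as used in BowtieNet's segmentation step, is to apply the same filter at several dilation (atrous) rates in parallel and combine the branches, so the network captures context at multiple scales without adding parameters. The following is a minimal 1-D sketch of that idea under simplifying assumptions; the actual BowtieNet ASPP operates on 2-D feature maps between the SegNet encoder and decoder, and the names `atrous_conv1d` and `aspp1d` are illustrative, not from the paper.

```python
def atrous_conv1d(signal, kernel, rate):
    """Dilated (atrous) convolution: kernel taps are spaced `rate` apart.

    A rate of 1 is an ordinary convolution; larger rates enlarge the
    receptive field without adding kernel weights. Out-of-range taps
    are treated as zero (zero padding).
    """
    k = len(kernel)
    half = ((k - 1) * rate) // 2  # half of the dilated receptive field
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(k):
            idx = i - half + j * rate
            if 0 <= idx < len(signal):
                acc += signal[idx] * kernel[j]
        out.append(acc)
    return out


def aspp1d(signal, kernel, rates=(1, 2, 4)):
    """ASPP sketch: run the same kernel at each rate in parallel and
    return the branch outputs (a real ASPP concatenates these feature
    maps and fuses them with a 1x1 convolution)."""
    return [atrous_conv1d(signal, kernel, r) for r in rates]


# An impulse input makes the growing receptive field easy to see:
sig = [0.0, 0.0, 1.0, 0.0, 0.0]
branches = aspp1d(sig, kernel=[1.0, 1.0, 1.0])
```

With the impulse input, the rate-1 branch pools only immediate neighbors, while the rate-4 branch reaches four samples away, which is the multi-scale context aggregation that helps segment objects of varying size.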
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61421004, 61572499, and 61632003.