To address imprecise semantic segmentation and coarse saliency maps for indoor scene images, we propose FG-Net (feature regulator and dual-path guidance), a network architecture based on optimized multimodal feature extraction and dual-path guided decoding. Specifically, the proposed feature regulator sequentially applies noise filtering, reweighted representation, complementary differencing, and interactive fusion to the multimodal features at each stage, strengthening the aggregation of RGB and depth features and optimizing the multimodal feature representation during extraction. The rich cross-modal cues produced by this interactive fusion are then introduced at the decoding stage to further exploit the strengths of the multimodal features. Combined with a dual-path cooperative guidance structure, the decoder fuses multi-scale, multi-level feature information to output more refined saliency maps. Experiments on the public NYUD-v2 and SUN RGB-D datasets achieve 48.5% on the main evaluation metric, mIoU, outperforming other state-of-the-art algorithms. The results show that the proposed algorithm achieves finer-grained semantic segmentation of indoor scene images and exhibits good generalization and robustness.
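As a rough illustration of the reweighting idea behind the feature regulator (the actual FG-Net module is considerably richer, with noise filtering and complementary interaction), a toy channel-wise gated fusion of RGB and depth feature maps might look like the sketch below; the gating rule here is an assumption for illustration only, not the paper's design.

```python
import numpy as np

def gated_fuse(rgb_feat, depth_feat):
    """Toy channel-wise gated fusion of RGB and depth feature maps (C, H, W).

    Illustrative only: the gate favors whichever modality has the stronger
    mean channel response, a much simpler rule than FG-Net's regulator.
    """
    # Channel descriptors via global average pooling: (C, H, W) -> (C,).
    g_rgb = rgb_feat.mean(axis=(1, 2))
    g_dep = depth_feat.mean(axis=(1, 2))
    # Sigmoid gate per channel, broadcast back over spatial dimensions.
    gate = 1.0 / (1.0 + np.exp(-(g_rgb - g_dep)))
    gate = gate[:, None, None]
    return gate * rgb_feat + (1.0 - gate) * depth_feat

rgb = np.ones((4, 8, 8))     # synthetic RGB feature map
dep = np.zeros((4, 8, 8))    # synthetic depth feature map
fused = gated_fuse(rgb, dep)
```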
Funding: National Natural Science Foundation of China (41801379).
Abstract: The RGB-D camera is a new type of sensor that can simultaneously capture depth and texture information of an unknown 3D scene, and it has been widely applied in various fields. In practice, implementing such applications with an RGB-D camera requires calibrating it first. To the best of our knowledge, no systematic summary of RGB-D camera calibration methods currently exists. This paper therefore presents a systematic review of RGB-D camera calibration. First, the measurement mechanism and the principles underlying RGB-D camera calibration methods are presented. Next, since some applications need to fuse depth and color information, methods for calibrating the relative pose between the depth camera and the RGB camera are introduced in Section 2. The depth correction models used within RGB-D cameras are then summarized and compared in Section 3. Third, because the field of view of an RGB-D camera is relatively narrow, which limits some applications, calibration models for the relative pose among multiple RGB-D cameras are discussed in Section 4. Finally, future directions and trends in RGB-D camera calibration are discussed.
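The depth-color fusion that motivates this calibration rests on the relative pose directly: once the depth-to-color rotation and translation are known, each depth pixel can be re-projected into the color image. The sketch below shows that mapping under an ideal pinhole model with no lens distortion; the intrinsic values and extrinsics are illustrative, not taken from any particular device.

```python
import numpy as np

def register_depth_to_color(u, v, z, K_d, K_c, R, t):
    """Map a depth pixel (u, v) with depth z (meters) into the color image.

    K_d, K_c: 3x3 intrinsic matrices of the depth and color cameras.
    R, t: rotation (3x3) and translation (3,) from depth frame to color frame.
    """
    # Back-project the depth pixel to a 3D point in the depth camera frame.
    p_d = z * np.linalg.inv(K_d) @ np.array([u, v, 1.0])
    # Transform into the color camera frame via the calibrated extrinsics.
    p_c = R @ p_d + t
    # Project onto the color image plane (perspective division).
    uv = K_c @ (p_c / p_c[2])
    return uv[0], uv[1]

K = np.array([[525.0, 0, 320.0], [0, 525.0, 240.0], [0, 0, 1.0]])
# Sanity check: identity extrinsics and equal intrinsics map a pixel to itself.
u, v = register_depth_to_color(320.0, 240.0, 1.5, K, K, np.eye(3), np.zeros(3))
```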
Funding: Supported by the National Natural Science Foundation of China (61501034).
Abstract: This paper proposes a semi-direct visual odometry and mapping system for an RGB-D camera that combines the merits of feature-based and direct methods. The system directly estimates the camera motion between two consecutive RGB-D frames by minimizing the photometric error. To handle outliers and noise, a robust sensor model built on the t-distribution and an error function mixing depth and photometric errors are used to improve accuracy and robustness. Local graph optimization based on keyframes reduces the accumulated error and refines the local map. The loop closure detection method, which combines appearance similarity with spatial location constraints, increases detection speed. Experimental results demonstrate that the proposed approach achieves higher accuracy in motion estimation and environment reconstruction than other state-of-the-art methods. Moreover, the approach runs in real time on a laptop without a GPU, making it attractive for robots with limited computational resources.
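A t-distribution sensor model of this kind is typically applied through iteratively re-weighted least squares: large photometric residuals receive small weights rather than being discarded outright. A minimal sketch of that weighting step follows, with an assumed degrees-of-freedom value nu = 5 (a common choice, not stated in the abstract).

```python
import numpy as np

def t_dist_weights(residuals, nu=5.0, iters=10):
    """Per-residual IRLS weights under a Student-t noise model.

    The scale sigma^2 and the weights are refined alternately; outlying
    residuals end up strongly down-weighted. Values are illustrative.
    """
    r = np.asarray(residuals, dtype=float)
    sigma2 = np.mean(r ** 2) + 1e-12          # initial scale estimate
    for _ in range(iters):
        w = (nu + 1.0) / (nu + r ** 2 / sigma2)
        sigma2 = np.sum(w * r ** 2) / len(r) + 1e-12
    return (nu + 1.0) / (nu + r ** 2 / sigma2)

r = np.array([0.1, -0.2, 0.05, 5.0])   # last residual is an outlier
w = t_dist_weights(r)
```

In the full system these weights would multiply each term of the photometric error before the motion update, so a few bad pixels cannot dominate the estimate.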
Funding: National Natural Science Foundation of China (No. 62173228).
Abstract: Accurate vehicle localization is a key technology for autonomous driving tasks in indoor parking lots, such as automated valet parking. Infrastructure-based cooperative driving systems have also become a means of realizing intelligent driving. In this paper, we propose a novel and practical vehicle localization system for indoor parking lots using infrastructure-mounted RGB-D cameras. We design a simple and efficient depth data preprocessing method to reduce the computational burden arising from the large data volume. We deliberately avoid hardware synchronization across the cameras in the sensor network, since it is extremely cumbersome and would significantly reduce the scalability of the system in mass deployments. To address the resulting data distortion that accompanies vehicle motion, we propose a vehicle localization method that performs template point cloud registration on the distributed depth data. Finally, a complete hardware system was built to verify the feasibility of our solution in a real-world environment. Experiments in an indoor parking lot demonstrated the effectiveness and accuracy of the proposed system, with a maximum root mean squared error of 5 cm at 15 Hz against the ground truth.
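Template point cloud registration ultimately reduces to estimating a rigid transform between a vehicle template and the observed depth points. As a hedged illustration, the sketch below shows only the closed-form alignment step (the Kabsch solution) under the simplifying assumption that point correspondences are already known; a real system would embed this inside an ICP-style matching loop.

```python
import numpy as np

def rigid_align(template, observed):
    """Least-squares rigid transform (R, t) mapping template points onto
    observed points with known one-to-one correspondences (Kabsch)."""
    mu_t, mu_o = template.mean(axis=0), observed.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (template - mu_t).T @ (observed - mu_o)
    U, _, Vt = np.linalg.svd(H)
    # Correction term guards against reflections (det = -1).
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])
    R = Vt.T @ D @ U.T
    t = mu_o - R @ mu_t
    return R, t

# Synthetic check: recover a known rotation about z and a translation.
rng = np.random.default_rng(0)
tmpl = rng.normal(size=(50, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
obs = tmpl @ R_true.T + np.array([1.0, -2.0, 0.5])
R_est, t_est = rigid_align(tmpl, obs)
```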
Funding: This work was funded by the National Natural Science Foundation of China (Grant No. 62172132), the Public Welfare Technology Research Project of Zhejiang Province (Grant No. LGF21F020014), and the Opening Project of the Key Laboratory of Public Security Information Application Based on Big-Data Architecture, Ministry of Public Security, Zhejiang Police College (Grant No. 2021DSJSYS002).
Abstract: The widespread availability of digital multimedia data has created new challenges in digital forensics. Traditional source camera identification algorithms usually rely on traces left by the capture process. However, these traces have become increasingly difficult to extract owing to the wide availability of image processing algorithms. Convolutional Neural Network (CNN)-based algorithms have demonstrated good discriminative capability across camera brands and even across camera models. However, their performance is not ideal when distinguishing between individual devices of the same model, because cameras of the same model typically use the same optical lens, image sensor, and image processing algorithms, resulting in minimal overall differences. In this paper, we propose a camera forensics algorithm based on multi-scale feature fusion to address these issues. The proposed algorithm extracts local features from feature maps at different scales and fuses them into a comprehensive feature representation, which is then fed into a camera fingerprint classification network. Building on the Swin-T backbone, we use Transformer blocks and Graph Convolutional Network (GCN) modules to fuse multi-scale features from different stages of the network. Experiments on established datasets demonstrate the feasibility and effectiveness of the proposed approach.
Funding: National Natural Science Foundation of China (NSFC) (No. 11775147); Guangdong Basic and Applied Basic Research Foundation (Nos. 2019A1515110130 and 2024A1515011832); Shenzhen Key Laboratory of Photonics and Biophotonics (ZDSYS20210623092006020); Shenzhen Science and Technology Program (Nos. JCYJ20210324095007020, JCYJ20200109105201936 and JCYJ20230808105019039).
Abstract: An ultrafast framing camera comprising a pulse-dilation device, a microchannel plate (MCP) imager, and an electronic imaging system is reported. The camera achieves a temporal resolution of 10 ps through the pulse-dilation device and gated MCP imager, and a spatial resolution of 100 μm through an electronic imaging system of combined magnetic lenses. The spatial resolution characteristics of the camera were studied both theoretically and experimentally. The results show that the combined magnetic lenses reduce field curvature and enlarge the working area: applying four magnetic lenses to the camera yields a working area 53 mm in diameter. The camera was further used to detect X-rays produced by a laser-target device; the diagnostic results indicate an X-ray pulse width of approximately 18 ps.
Funding: Social Development Project of the Jiangsu Key R&D Program (BE2022680); National Natural Science Foundation of China (Nos. 62371253, 52278119).
Abstract: This paper introduces an intelligent computational approach for extracting salient objects from images and estimating their distance with PTZ (Pan-Tilt-Zoom) cameras. PTZ cameras are widely deployed in public places for purposes such as public security management, natural disaster monitoring, and crisis alarms, particularly with the rapid development of Artificial Intelligence and global infrastructure projects. We combine Gaussian optics with the PTZ camera's pan and tilt rotation and optical zoom to estimate object distance. We present a novel monocular object distance estimation model based on the Focal Length-Target Pixel Size (FLTPS) relationship, achieving an accuracy above 95% for objects within a 5 km range. Salient object extraction is achieved with a simplified convolution kernel and the object's RGB features, which offers significantly faster computing speeds than Convolutional Neural Networks (CNNs). Additionally, we apply the dark channel prior fog removal algorithm, yielding a 20 dB increase in image definition that significantly benefits distance estimation. Our system offers stability and low device load, making it an asset for public security affairs and a reference point for future developments in surveillance hardware.
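The FLTPS relationship builds on the basic pinhole proportion: an object of known physical size that spans h pixels at focal length f (expressed in pixels) lies at distance roughly f·H/h. A minimal sketch of that core relation follows; the paper's full model additionally accounts for PTZ rotation and zoom, and the numbers below are illustrative.

```python
def distance_from_pixels(focal_px, real_height_m, pixel_height):
    """Pinhole-model distance estimate.

    focal_px:      focal length expressed in pixels
    real_height_m: known physical height of the target (meters)
    pixel_height:  height of the target in the image (pixels)
    Returns the approximate distance in meters.
    """
    return focal_px * real_height_m / pixel_height

# A 1.7 m tall person appearing 100 px tall at a 2000 px focal length:
d = distance_from_pixels(2000.0, 1.7, 100.0)   # 34.0 m
```

Because the estimate scales linearly with focal length, a calibrated zoom lookup (focal length per zoom step) is all that is needed to extend this to a zooming PTZ camera.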
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52025121, 52394263) and the National Key R&D Plan of China (Grant No. 2023YFD2000301).
Abstract: This paper develops an automatic miscalibration detection and correction framework to maintain accurate LiDAR-camera calibration for autonomous vehicles after sensor drift. First, a monitoring algorithm that continuously detects miscalibration in each frame is designed, leveraging the rotational motion each individual sensor observes. Then, when sensor drift occurs, projection constraints between visual feature points and LiDAR 3-D points are used to compute the scaled camera motion, which is in turn used to align the drifted LiDAR scan with the camera image. Finally, the proposed method is thoroughly compared with two representative approaches in online experiments with varying levels of random drift, and is further extended to an offline calibration experiment, where it is benchmarked against two existing methods.
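The projection constraint at the heart of the correction step can be illustrated by re-projecting LiDAR points into the image and measuring their pixel distance to matched visual features; a rising error over frames is one simple miscalibration cue. Below is a sketch under the assumption that point-to-feature matches are given; the intrinsics and points are illustrative.

```python
import numpy as np

def reprojection_errors(pts_lidar, px_obs, K, R, t):
    """Project LiDAR 3-D points into the image with extrinsics (R, t) and
    intrinsics K, returning pixel distances to matched visual features."""
    pts_cam = pts_lidar @ R.T + t              # LiDAR frame -> camera frame
    proj = pts_cam @ K.T
    px = proj[:, :2] / proj[:, 2:3]            # perspective division
    return np.linalg.norm(px - px_obs, axis=1)

K = np.array([[800.0, 0, 640.0], [0, 800.0, 360.0], [0, 0, 1.0]])
pts = np.array([[0.0, 0.0, 10.0], [1.0, 0.5, 8.0]])
# Sanity check: with identity extrinsics, observed pixels equal projections,
# so the reprojection error should be zero for every point.
obs = pts @ K.T
obs = obs[:, :2] / obs[:, 2:3]
err = reprojection_errors(pts, obs, K, np.eye(3), np.zeros(3))
```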