Virtual reality (VR) and augmented reality (AR) are revolutionizing our lives. Near-eye displays are crucial technologies for VR and AR. Despite the rapid advances in near-eye display technologies, challenges remain in achieving a large field of view, high resolution, high image quality, a natural free 3D effect, and a compact form factor. Great efforts have been devoted to striking a balance between visual performance and device compactness. While traditional optics are nearing their limitations in addressing these challenges, ultra-thin metasurface optics, with their high light-modulating capabilities, may present a promising solution. In this review, we first introduce VR and AR near-eye displays, then briefly explain the working principles of light-modulating metasurfaces, review recent developments in metasurface devices geared toward near-eye display applications, delve into several advanced natural 3D near-eye display technologies based on metasurfaces, and finally discuss the remaining challenges and future perspectives associated with metasurfaces for near-eye display applications.
The widespread availability of digital multimedia data has led to a new challenge in digital forensics. Traditional source camera identification algorithms usually rely on various traces left in the capturing process. However, these traces have become increasingly difficult to extract due to the wide availability of various image processing algorithms. Convolutional Neural Network (CNN)-based algorithms have demonstrated good discriminative capabilities for different brands and even different models of camera devices. However, their performance is not ideal when distinguishing between individual devices of the same model, because cameras of the same model typically use the same optical lens, image sensor, and image processing algorithms, which results in minimal overall differences. In this paper, we propose a camera forensics algorithm based on multi-scale feature fusion to address these issues. The proposed algorithm extracts different local features from feature maps of different scales and then fuses them to obtain a comprehensive feature representation. This representation is then fed into a subsequent camera fingerprint classification network. Building upon the Swin-T network, we utilize Transformer Blocks and Graph Convolutional Network (GCN) modules to fuse multi-scale features from different stages of the backbone network. Furthermore, we conduct experiments on established datasets to demonstrate the feasibility and effectiveness of the proposed approach.
An ultrafast framing camera with a pulse-dilation device, a microchannel plate (MCP) imager, and an electronic imaging system is reported. The camera achieved a temporal resolution of 10 ps by using a pulse-dilation device and gated MCP imager, and a spatial resolution of 100 μm by using an electronic imaging system comprising combined magnetic lenses. The spatial resolution characteristics of the camera were studied both theoretically and experimentally. The results showed that the camera with combined magnetic lenses reduced the field curvature and acquired a larger working area. A working area with a diameter of 53 mm was created by applying four magnetic lenses to the camera. Furthermore, the camera was used to detect the X-rays produced by the laser-targeting device. The diagnostic results indicated that the width of the X-ray pulse was approximately 18 ps.
This paper introduces an intelligent computational approach for extracting salient objects from images and estimating their distance information with PTZ (Pan-Tilt-Zoom) cameras. PTZ cameras have found wide applications in numerous public places, serving various purposes such as public security management, natural disaster monitoring, and crisis alarms, particularly with the rapid development of Artificial Intelligence and global infrastructural projects. In this paper, we combine Gauss optical principles with the PTZ camera's capabilities of horizontal and pitch rotation, as well as optical zoom, to estimate the distance of the object. We present a novel monocular object distance estimation model based on the Focal Length-Target Pixel Size (FLTPS) relationship, achieving an accuracy rate of over 95% for objects within a 5 km range. The salient object extraction is achieved through a simplified convolution kernel and the utilization of the object's RGB features, which offer significantly faster computing speeds compared to Convolutional Neural Networks (CNNs). Additionally, we introduce the dark channel prior fog removal algorithm, resulting in a 20 dB increase in image definition, which significantly benefits distance estimation. Our system offers the advantages of stability and low device load, making it an asset for public security affairs and providing a reference point for future developments in surveillance hardware.
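To make the focal length and target pixel size idea in the PTZ abstract above concrete, here is a minimal sketch of the textbook pinhole (Gaussian optics) relation, distance = f × H / h, where h is the object's height on the sensor. This is not the paper's FLTPS model, which the abstract does not spell out; the function name, the known real-world object height, and the pixel pitch parameter are all assumptions made for illustration.

```python
def estimate_distance_m(focal_length_mm, object_height_m, pixel_height, pixel_pitch_um):
    """Estimate object distance with the pinhole relation distance = f * H / h.

    focal_length_mm : current zoom focal length (assumed known from the camera)
    object_height_m : real-world object height (assumed known in advance)
    pixel_height    : object height in the image, in pixels
    pixel_pitch_um  : physical size of one sensor pixel, in micrometres
    """
    if pixel_height <= 0 or pixel_pitch_um <= 0:
        raise ValueError("pixel_height and pixel_pitch_um must be positive")
    # Height of the object's image on the sensor, converted to millimetres.
    image_height_mm = pixel_height * pixel_pitch_um / 1000.0
    # Millimetres cancel, leaving the distance in metres.
    return focal_length_mm * object_height_m / image_height_mm


# A 1.7 m tall target spanning 50 pixels at f = 100 mm on a 3.4 um sensor
# comes out at roughly 1 km.
print(round(estimate_distance_m(100.0, 1.7, 50, 3.4)))
```

Under this relation, doubling the focal length at a fixed distance doubles the target's pixel height, which is why zoom level and pixel size together can stand in for range on a single camera.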
This paper aims to develop an automatic miscalibration detection and correction framework to maintain accurate calibration of LiDAR and camera for autonomous vehicles after sensor drift. First, a monitoring algorithm that can continuously detect the miscalibration in each frame is designed, leveraging the rotational motion each individual sensor observes. Then, as sensor drift occurs, the projection constraints between visual feature points and LiDAR 3-D points are used to compute the scaled camera motion, which is further utilized to align the drifted LiDAR scan with the camera image. Finally, the proposed method is thoroughly compared with two representative approaches in online experiments with varying levels of random drift; the method is then extended to an offline calibration experiment and validated by comparison with two existing benchmark methods.
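One way to read "leveraging the rotational motion each individual sensor observes" in the LiDAR-camera abstract above is the standard rotation consistency constraint: if the extrinsic rotation R_ext maps LiDAR coordinates to camera coordinates, the per-frame motions should satisfy R_cam = R_ext · R_lidar · R_extᵀ. The sketch below measures how far a stored extrinsic is from satisfying that constraint. It is an illustrative formulation only, not the paper's monitoring algorithm; all function names are hypothetical and any alarm threshold would have to be chosen separately.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_angle_deg(r):
    """Rotation angle of a 3x3 rotation matrix, from its trace."""
    c = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def miscalibration_residual_deg(r_cam, r_lidar, r_ext):
    """Angle between the observed camera rotation and the one predicted
    from the LiDAR rotation through the stored extrinsic r_ext."""
    predicted_cam = matmul(matmul(r_ext, r_lidar), transpose(r_ext))
    return rotation_angle_deg(matmul(transpose(r_cam), predicted_cam))


# Simulated frame: the true extrinsic is a 90 degree rotation about x,
# and the vehicle yaws 10 degrees as seen by the LiDAR.
true_ext = rot_x(90.0)
r_lidar = rot_z(10.0)
r_cam = matmul(matmul(true_ext, r_lidar), transpose(true_ext))

print(miscalibration_residual_deg(r_cam, r_lidar, true_ext))      # close to zero
print(miscalibration_residual_deg(r_cam, r_lidar, rot_x(95.0)))   # clearly non-zero
```

A drift about the same axis as the vehicle's rotation leaves this residual at zero, so a single motion cannot expose every error; in this sketch, motions about different axes would be needed to observe the full extrinsic rotation.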
The geometric accuracy of topographic mapping with high-resolution remote sensing images is inevitably affected by the orbiter attitude jitter. Therefore, it is necessary to conduct preliminary research on the stereo mapping camera equipped on a lunar orbiter before launch. In this work, an imaging simulation method considering the attitude jitter is presented. The impact of different attitude jitter on terrain undulation is analyzed by simulating jitter at each of the three attitude angles. The proposed simulation method is based on the rigorous sensor model, using the lunar digital elevation model (DEM) and orthoimage as reference data. The orbit and attitude of the lunar stereo mapping camera are simulated while considering the attitude jitter. Two-dimensional simulated stereo images are generated according to the position and attitude of the orbiter in a given orbit. Experimental analyses were conducted using the DEM and the simulated stereo images. The simulation imaging results demonstrate that the proposed method can ensure imaging efficiency without losing the accuracy of topographic mapping. The effect of attitude jitter on the stereo mapping accuracy of the simulated images was analyzed through a DEM comparison.
Real-time indoor camera localization is a significant problem in indoor robot navigation and surveillance systems. The scene can change during the image sequence, and such changes play a vital role in the localization performance of robotic applications in terms of accuracy and speed. This research proposed a real-time indoor camera localization system based on a recurrent neural network that detects scene change during the image sequence. The proposed system is trained on an annotated image dataset and predicts the camera pose in real time. The system mainly improved the localization performance of indoor cameras by more accurately predicting the camera pose. It also recognizes the scene changes during the sequence and evaluates the effects of these changes. This system achieved high accuracy and real-time performance. The scene change detection process was performed using visual rhythm and the proposed recurrent deep architecture, which performed camera pose prediction and scene change impact evaluation. Overall, this study proposed a novel real-time localization system for indoor cameras that detects scene changes and shows how they affect localization performance.
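To achieve long-distance, high-reliability transmission with reduced complexity, the transmission and display of Camera Link Full interface data over HD-SDI was studied in depth. An FPGA serves as the core processor. Because the camera outputs multiple frame rates, a strategy of frame-rate detection and full frame-rate reduction is adopted, with three SRAMs used as buffers to perform frame-rate conversion and meet the 25 Hz frame rate required by HD-SDI. Considering the SRAM data width, a FIFO line-buffer strategy converts the 10-tap, 80-bit image data output by Camera Link Full80 into single-channel 8-bit image data. Finally, the system design was completed and verified experimentally. The results show that the system converts image data from Camera Link Full80 at multiple frame rates, such as 50 Hz, 100 Hz, and 500 Hz, to the 25-frame 1080i format of the HD-SDI interface and displays it in real time, with rich image gradation and no distortion.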
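The two data conversions in the Camera Link to HD-SDI abstract above can be illustrated in software. This is a rough sketch only: the actual design runs in an FPGA with FIFO line buffers and SRAM frame buffers, while the code below just shows the arithmetic of splitting one 80-bit, 10-tap word into ten 8-bit pixels and of reducing a source frame rate to 25 Hz by keeping every N-th frame. The tap ordering (tap 0 in the least significant byte), the plain frame-dropping scheme, and both function names are assumptions not stated in the abstract.

```python
def unpack_taps(word80, taps=10, bits=8):
    """Split one 80-bit word into ten 8-bit pixel values.

    Assumes tap 0 occupies the least significant byte; the real tap
    order depends on the camera and is not given in the abstract.
    """
    if not 0 <= word80 < (1 << (taps * bits)):
        raise ValueError("word does not fit in taps * bits bits")
    mask = (1 << bits) - 1
    return [(word80 >> (bits * i)) & mask for i in range(taps)]


def decimation_factor(source_hz, target_hz=25):
    """Number of source frames per output frame when dropping frames
    to reach the target rate (integer ratios only)."""
    if source_hz <= 0 or source_hz % target_hz != 0:
        raise ValueError("source rate must be a positive multiple of the target rate")
    return source_hz // target_hz


# Ten pixels with values 0..9 packed into a single 80-bit word.
word = int.from_bytes(bytes(range(10)), "little")
print(unpack_taps(word))

# Frame rates named in the abstract, reduced to 25 Hz.
for hz in (50, 100, 500):
    print(hz, "Hz -> keep 1 frame in", decimation_factor(hz))
```

Interlacing into 1080i fields is not modeled here.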
Funding (metasurface near-eye display review): Supported by the National Key Research and Development Program of China (2021YFB2802100) and the National Natural Science Foundation of China (62075127 and 62105203).
Funding (source camera identification): This work was funded by the National Natural Science Foundation of China (Grant No. 62172132), the Public Welfare Technology Research Project of Zhejiang Province (Grant No. LGF21F020014), and the Opening Project of Key Laboratory of Public Security Information Application Based on Big-Data Architecture, Ministry of Public Security of Zhejiang Police College (Grant No. 2021DSJSYS002).
Funding (ultrafast framing camera): National Natural Science Foundation of China (NSFC) (No. 11775147), Guangdong Basic and Applied Basic Research Foundation (Nos. 2019A1515110130 and 2024A1515011832), Shenzhen Key Laboratory of Photonics and Biophotonics (ZDSYS20210623092006020), and Shenzhen Science and Technology Program (Nos. JCYJ20210324095007020, JCYJ20200109105201936 and JCYJ20230808105019039).
Funding (PTZ camera distance estimation): Supported by the Social Development Project of Jiangsu Key R&D Program (BE2022680) and the National Natural Science Foundation of China (Nos. 62371253, 52278119).
Funding (LiDAR-camera miscalibration correction): Supported by the National Natural Science Foundation of China (Grant Nos. 52025121, 52394263) and the National Key R&D Plan of China (Grant No. 2023YFD2000301).
Funding (lunar stereo mapping camera simulation): Supported by the National Natural Science Foundation of China (42221002, 42171432), the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0100), and the Fundamental Research Funds for the Central Universities.