Funding: Supported by the National Natural Science Foundation of China (Grant No. 52272421) and the Shenzhen Fundamental Research Fund (Grant Nos. JCYJ20190808142613246 and 20200803015912001).
Abstract: Estimating depth from images captured by camera sensors is crucial for the advancement of autonomous driving technologies and has gained significant attention in recent years. However, most previous methods rely on stacked pooling or strided convolution to extract high-level features, which can limit network performance and lead to information redundancy. This paper proposes an improved bidirectional feature pyramid module (BiFPN) and a channel attention module (Seblock: squeeze and excitation) to address these issues in existing methods based on a monocular camera sensor. The Seblock redistributes channel feature weights to enhance useful information, while the improved BiFPN facilitates efficient fusion of multi-scale features. The proposed method is an end-to-end solution without any additional post-processing, resulting in efficient depth estimation. Experimental results show that the proposed method is competitive with state-of-the-art algorithms and preserves the fine-grained texture of scene depth.
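The channel reweighting attributed to the Seblock can be illustrated with a minimal, dependency-free sketch of a generic squeeze-and-excitation step. This is not the paper's implementation: the function name `se_block`, the weight shapes, the omission of biases, and the toy input are all assumptions made for illustration.

```python
import math

def se_block(x, w1, w2):
    """Reweight the channels of x with squeeze-and-excitation gates.

    x  : list of C feature maps, each a list of rows (hypothetical layout)
    w1 : (C/r) x C weights of the bottleneck layer (assumed, biases omitted)
    w2 : C x (C/r) weights of the expansion layer (assumed, biases omitted)
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck layer with ReLU, then sigmoid gates in (0, 1).
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h))))
         for row in w2]
    # Scale: multiply every value in channel c by its gate s[c].
    return [[[gate * v for v in row] for row in ch] for ch, gate in zip(x, s)]

# Toy input: two 1x2 feature maps; all-zero weights make every gate 0.5.
features = [[[2.0, 4.0]], [[6.0, 8.0]]]
scaled = se_block(features, [[0.0, 0.0]], [[0.0], [0.0]])
print(scaled)
```

In a trained network the two weight matrices are learned, so informative channels receive gates close to 1 and redundant ones are suppressed.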
Abstract: The need for efficient and reproducible development processes for sensor and perception systems is growing with their increased use in modern vehicles. Such processes can be achieved by using virtual test environments and virtual sensor models. In this context, the present paper documents the development of a sensor model for depth estimation in virtual three-dimensional scenarios. For this purpose, the geometric and algorithmic principles of stereoscopic camera systems are recreated in virtual form. The model is implemented as a subroutine in the Epic Games Unreal Engine, one of the most common game engines. Its architecture consists of several independent procedures that enable local depth estimation as well as reconstruction of a whole three-dimensional scene. A separate programme for calibrating the model is also presented. Beyond the basic principles, the architecture, and the implementation, this work documents the evaluation of the model. It is shown that the model meets specifically defined requirements for real-time capability and the accuracy of the evaluation. Thus, it is suitable for the virtual testing of common algorithms and highly automated driving functions.
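The geometric principle behind stereoscopic depth estimation can be sketched with the standard relation for a rectified pinhole stereo pair, Z = f·B/d. The abstract does not say whether the model uses exactly this form; the function name, parameter names, and example values below are assumptions for illustration only.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair (textbook relation).

    disparity_px : horizontal pixel offset between left and right views
    focal_px     : focal length expressed in pixels
    baseline_m   : distance between the two camera centres in metres
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px

# Hypothetical values: f = 1000 px, B = 0.5 m, d = 20 px.
z = depth_from_disparity(20.0, 1000.0, 0.5)
print(z)  # 25.0
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects than for near ones, which is one reason an accuracy requirement has to be defined for such a model.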
Funding: Supported by the National Natural Science Foundation of China (61100207), the National Key Technology Research and Development Program of the Ministry of Science and Technology of China (2014BAK14B03), and the Fundamental Research Funds for the Central Universities (2013PT132013XZ12).
Abstract: Most sensors or cameras discussed in the sensor network community are treated as 3D homogeneous, even though their 2D coverage areas in the ground plane are heterogeneous. Meanwhile, the objects observed by camera networks are usually simplified as 2D points in previous literature. However, in actual application scenes, the cameras are heterogeneous, with different heights and action radii, and the observed objects have 3D features (i.e., height). This paper presents a sensor planning formulation that addresses the efficiency of visual tracking in 3D heterogeneous camera networks that track and detect people traversing a region. The sensor planning problem consists of three issues: (i) how to model the 3D heterogeneous cameras; (ii) how to rank the visibility, which ensures that the object of interest is visible in a camera's field of view; (iii) how to reconfigure the 3D viewing orientations of the cameras. This paper studies the geometric properties of 3D heterogeneous camera networks and presents an evaluation formulation to rank the visibility of observed objects. A sensor planning method is then proposed to improve the efficiency of visual tracking. Finally, numerical results show that the proposed method can improve the tracking performance of the system compared to conventional strategies.
Abstract: The increased memory and computational power available in imaging sensor networks has led researchers to run image processing algorithms on these distributed resources. In this paper, a typical perimeter is investigated with a number of sensors placed to form an image sensor network for the purpose of content-based distributed image search. An image search algorithm is used to enable distributed content-based image search within each sensor node. An energy model is presented to calculate energy efficiency for various cases of image search and transmission. The simulations consider either continuous monitoring or event-driven activity on the perimeter. The simulation setups consider distributed image processing on sensor nodes, and the results show that the energy saving is significant if search algorithms are embedded in image sensor nodes and image processing is distributed across sensor nodes. The tradeoff between sensor lifetime, distributed image search, and network deployment cost is also investigated.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 60902067) and the Foundation for Science & Technology Research Project of Jilin Province (Grant No. 11ZDGG001).
Abstract: Due to the electronic rolling shutter, high-speed Complementary Metal-Oxide Semiconductor (CMOS) aerial cameras are generally subject to geometric distortions, which cannot be perfectly corrected by conventional vision-based algorithms. In this paper we propose a novel approach to address the problem of rolling shutter distortion in aerial imaging. A mathematical model is established by the coordinate transformation method. It can directly calculate the pixel distortion when an aerial camera is imaging at arbitrary attitude angles. All pixel distortions then form a distortion map over the whole CMOS array, and the map is exploited in the image rectification process incorporating reverse projection. The error analysis indicates that, within the margin of measuring errors, the final calculation error of our model is less than 1/2 pixel. The experimental results show that our approach yields good rectification performance in a series of images with different distortions. We demonstrate that our method outperforms other vision-based algorithms in terms of computational complexity, which makes it more suitable for real-time aerial imaging.