Journal Articles
27 articles found
1. Rail-Pillar Net: A 3D Detection Network for Railway Foreign Object Based on LiDAR
Authors: Fan Li, Shuyao Zhang, Jie Yang, Zhicheng Feng, Zhichao Chen. Computers, Materials & Continua (SCIE, EI), 2024, No. 9, pp. 3819–3833.
Aiming at the limitations of existing railway foreign object detection methods based on two-dimensional (2D) images, such as short detection distance, strong influence of the environment, and lack of distance information, we propose Rail-PillarNet, a three-dimensional (3D) LiDAR (Light Detection and Ranging) railway foreign object detection method based on an improvement of PointPillars. Firstly, the parallel attention pillar encoder (PAPE) is designed to fully extract the features of the pillars and alleviate the loss of local fine-grained information in the PointPillars pillar encoder. Secondly, a fine backbone network is designed to improve the feature extraction capability of the network by combining the coding characteristics of LiDAR point cloud features with a residual structure. Finally, the initial weight parameters of the model were optimised by a transfer learning training method to further improve accuracy. Experimental results on the OSDaR23 dataset show that the average accuracy of Rail-PillarNet reaches 58.51%, higher than most mainstream models, with 5.49 M parameters. Compared with PointPillars, the accuracy for each target is improved by 10.94%, 3.53%, 16.96% and 19.90%, respectively, while the number of parameters increases by only 0.64 M, achieving a balance between parameter count and accuracy.
Keywords: railway foreign object; light detection and ranging (LiDAR); 3D object detection; PointPillars; parallel attention mechanism; transfer learning
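As background to the pillar encoder this entry builds on, the PointPillars-style grouping step can be sketched as follows. The grid size, per-pillar point cap, and shapes here are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of PointPillars-style pillar grouping (illustrative shapes).
# Points (N, 3) are binned into a 2D grid of "pillars" on the x-y plane;
# each pillar keeps up to max_pts points, zero-padded.
import numpy as np

def group_into_pillars(points, pillar_size=0.5, max_pts=4):
    # integer pillar index for each point on the x-y plane
    idx = np.floor(points[:, :2] / pillar_size).astype(int)
    pillars = {}
    for p, (ix, iy) in zip(points, idx):
        pillars.setdefault((ix, iy), []).append(p)
    # pad/truncate each pillar to a fixed number of points
    out = {}
    for k, pts in pillars.items():
        arr = np.zeros((max_pts, 3), dtype=np.float32)
        kept = np.asarray(pts[:max_pts], dtype=np.float32)
        arr[:len(kept)] = kept
        out[k] = arr
    return out

pts = np.array([[0.1, 0.1, 0.0], [0.2, 0.3, 1.0], [0.9, 0.9, 0.5]])
pillars = group_into_pillars(pts)
print(len(pillars))  # 2 occupied pillars: (0,0) holds two points, (1,1) one
```

A learned per-point feature network and scatter-back to a pseudo-image would follow this grouping in a full pipeline.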
2. Depth-Guided Vision Transformer With Normalizing Flows for Monocular 3D Object Detection
Authors: Cong Pan, Junran Peng, Zhaoxiang Zhang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 3, pp. 673–689.
Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps from off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then use LiDAR-based object detectors, or focus on image and depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and complex convolutional fusion modes. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps to achieve more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them using cross-attention to exchange information. Furthermore, with the help of pixel-wise relative depth values in depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our proposed method and its superior performance over previous counterparts.
Keywords: monocular 3D object detection; normalizing flows; Swin Transformer
3. MFF-Net: Multimodal Feature Fusion Network for 3D Object Detection
Authors: Peicheng Shi, Zhiqiang Liu, Heng Qi, Aixi Yang. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 5615–5637.
In complex traffic scenarios, it is very important for autonomous vehicles to accurately perceive, in advance, the dynamic information of other vehicles around them. The accuracy of 3D object detection is affected by problems such as illumination changes, object occlusion, and detection distance. To face these challenges, we propose a multimodal feature fusion network for 3D object detection (MFF-Net). In this research, we first use a spatial transformation projection algorithm to map image features into the feature space, so that the image features are in the same spatial dimension as the point cloud features when fused. Then, feature channel weighting is performed using an adaptive expression augmentation fusion network to enhance important features, suppress useless ones, and increase the directionality of the network toward features. Finally, we reduce the probability of false and missed detections in the non-maximum suppression (NMS) algorithm by adjusting the one-dimensional threshold. Altogether, this yields a complete 3D object detection network based on multimodal feature fusion. Experimental results show that the proposed network achieves an average accuracy of 82.60% on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset, outperforming previous state-of-the-art multimodal fusion networks. On the Easy, Moderate, and Hard evaluation levels, the accuracy reaches 90.96%, 81.46%, and 75.39%, respectively. This shows that the MFF-Net network performs well in 3D object detection.
Keywords: 3D object detection; multimodal fusion; neural network; autonomous driving; attention mechanism
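For readers unfamiliar with the non-maximum suppression step this entry tunes, a minimal axis-aligned 2D NMS sketch follows. The box format and threshold value are illustrative, not MFF-Net's actual procedure.

```python
# Greedy NMS sketch with axis-aligned boxes: keep the highest-scoring box,
# drop any remaining box whose IoU with it exceeds the threshold, repeat.
import numpy as np

def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh):
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while len(order):
        i = int(order[0])
        keep.append(i)
        order = np.array([j for j in order[1:] if iou(boxes[i], boxes[j]) < thresh])
    return keep

boxes = [(0, 0, 2, 2), (0.1, 0.1, 2.1, 2.1), (5, 5, 6, 6)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, thresh=0.5))  # [0, 2]: near-duplicate box 1 suppressed
```

A lower threshold suppresses more aggressively (fewer duplicates, more missed overlapping objects); a higher one does the reverse, which is the trade-off the abstract's threshold adjustment targets.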
4. Monocular 3D object detection with Pseudo-LiDAR confidence sampling and hierarchical geometric feature extraction in 6G network
Authors: Jianlong Zhang, Guangzu Fang, Bin Wang, Xiaobo Zhou, Qingqi Pei, Chen Chen. Digital Communications and Networks (SCIE, CSCD), 2023, No. 4, pp. 827–835.
The high bandwidth and low latency of 6G network technology enable the successful application of monocular 3D object detection on vehicle platforms. Monocular 3D object detection based on Pseudo-LiDAR is a low-cost, low-power alternative to LiDAR solutions in the field of autonomous driving. However, this technique has some problems: (1) the poor quality of generated Pseudo-LiDAR point clouds, resulting from the nonlinear error distribution of monocular depth estimation, and (2) the weak representation capability of point cloud features, because the global geometric structure of the point cloud is neglected in LiDAR-based 3D detection networks. Therefore, we propose a Pseudo-LiDAR confidence sampling strategy and a hierarchical geometric feature extraction module for monocular 3D object detection. We first design a point cloud confidence sampling strategy based on a 3D Gaussian distribution that assigns low confidence to points with large depth estimation error and filters them out accordingly. We then present a hierarchical geometric feature extraction module that aggregates local neighborhood features and uses a dual transformer to capture global geometric features of the point cloud. Finally, our detection framework is based on Point-Voxel-RCNN (PV-RCNN), with the high-quality Pseudo-LiDAR and enriched geometric features as input. Experimental results show that our method achieves satisfactory results in monocular 3D object detection.
Keywords: monocular 3D object detection; Pseudo-LiDAR; confidence sampling; hierarchical geometric feature extraction
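The confidence-sampling idea can be sketched in one dimension: weight each pseudo-LiDAR point by a Gaussian of its estimated depth error and drop low-confidence points. The error model, sigma, and threshold below are illustrative assumptions, not the paper's values.

```python
# Sketch of Gaussian confidence filtering for pseudo-LiDAR points:
# large depth error -> small confidence -> point is discarded.
import numpy as np

def confidence_filter(points, depth_err, sigma=1.0, keep_thresh=0.5):
    conf = np.exp(-0.5 * (depth_err / sigma) ** 2)  # Gaussian confidence in [0, 1]
    mask = conf >= keep_thresh
    return points[mask], conf

points = np.random.rand(5, 3)                     # dummy pseudo-LiDAR points
depth_err = np.array([0.1, 2.0, 0.3, 3.0, 0.0])   # hypothetical per-point errors
kept, conf = confidence_filter(points, depth_err)
print(len(kept))  # 3: the two points with large depth error are filtered out
```

The paper's actual strategy uses a full 3D Gaussian over the point distribution; this 1D version only illustrates the weight-then-filter mechanism.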
5. 3D Object Detection with Attention: Shell-Based Modeling
Authors: Xiaorui Zhang, Ziquan Zhao, Wei Sun, Qi Cui. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 7, pp. 537–550.
LiDAR point cloud-based 3D object detection aims to sense the surrounding environment by anchoring objects with bounding boxes (BBox). However, in the three-dimensional space of autonomous driving scenes, previous object detection methods, because they pre-process the original LiDAR point cloud into voxels or pillars, lose the coordinate information of the original point cloud, slow down detection, and produce inaccurate bounding box positioning. To address these issues, this study proposes a new two-stage network structure that extracts point cloud features directly with PointNet++, effectively preserving the original point cloud coordinate information. To improve detection accuracy, a shell-based modeling method is proposed: it first roughly determines which spherical shell the coordinates belong to, then refines the result toward the ground truth, thereby narrowing the localization range and improving detection accuracy. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module with a skip connection structure. Selected features are highlighted by weighting them along the feature dimensions; after training, feature weights favorable for object detection become larger, so the extracted features are better adapted to the detection task. Extensive comparison and ablation experiments on the KITTI dataset verify the effectiveness of the proposed method in improving recall and precision.
Keywords: 3D object detection; autonomous driving; point cloud; shell-based modeling; self-attention mechanism
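The coarse stage of the shell-based idea amounts to a radial binning: decide which spherical shell a coordinate falls in, which narrows the subsequent regression range. The shell width and center below are illustrative assumptions, not the paper's parameters.

```python
# Sketch of coarse shell assignment: bin points by radial distance to a center.
import numpy as np

def shell_index(points, center, shell_width=1.0):
    r = np.linalg.norm(points - center, axis=1)   # radial distance to center
    return np.floor(r / shell_width).astype(int)  # coarse shell id per point

pts = np.array([[0.5, 0.0, 0.0], [1.5, 0.0, 0.0], [2.9, 0.0, 0.0]])
print(shell_index(pts, center=np.zeros(3)))  # [0 1 2]
```

A refinement stage would then regress a residual within the predicted shell rather than over the full range.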
6. Point Cloud Processing Methods for 3D Point Cloud Detection Tasks
Authors: WANG Chongchong, LI Yao, WANG Beibei, CAO Hong, ZHANG Yanyong. ZTE Communications, 2023, No. 4, pp. 38–46.
Light detection and ranging (LiDAR) sensors play a vital role in acquiring 3D point cloud data and extracting valuable information about objects for tasks such as autonomous driving, robotics, and virtual reality (VR). However, the sparse and disordered nature of the 3D point cloud poses significant challenges to feature extraction, and overcoming these limitations is critical for 3D point cloud processing. 3D point cloud object detection is a challenging and crucial task, in which point cloud processing and feature extraction methods play a central role and have a significant impact on detection performance. In this overview of outstanding work on object detection from 3D point clouds, we focus on summarizing the methods employed in 3D point cloud processing. We introduce how point clouds are processed in classical 3D object detection algorithms and the improvements those algorithms make to solve problems in point cloud processing. Different voxelization methods and point cloud sampling strategies influence the extracted features and thereby impact the final detection performance.
Keywords: point cloud processing; 3D object detection; point cloud voxelization; bird's eye view; deep learning
7. General and robust voxel feature learning with Transformer for 3D object detection (cited: 1)
Authors: LI Yang, GE Hongwei. Journal of Measurement Science and Instrumentation (CAS, CSCD), 2022, No. 1, pp. 51–60.
Self-attention networks and the Transformer have dominated machine translation and natural language processing, and have shown great potential in image vision tasks such as image classification and object detection. Inspired by the great progress of the Transformer, we propose a novel, general, and robust voxel feature encoder for 3D object detection based on the traditional Transformer. We first investigate the permutation invariance of self-attention over sequence data and apply it to point cloud processing. We then construct a voxel feature layer based on self-attention that adaptively learns the local and robust context of a voxel from the spatial relationships and context exchanged between all points within the voxel. Lastly, we construct a general voxel feature learning framework for 3D object detection with the voxel feature layer as its core. The voxel feature with Transformer (VFT) can easily be plugged into any other voxel-based 3D object detection framework and serves as the backbone of the voxel feature extractor. Experimental results on the KITTI dataset demonstrate that our method achieves state-of-the-art performance on 3D object detection.
Keywords: 3D object detection; self-attention networks; voxel feature with Transformer (VFT); point cloud; encoder-decoder
8. Adaptive multi-modal feature fusion for far and hard object detection
Authors: LI Yang, GE Hongwei. Journal of Measurement Science and Instrumentation (CAS, CSCD), 2021, No. 2, pp. 232–241.
To solve the difficult detection of far and hard objects caused by the sparseness and insufficient semantic information of LiDAR point clouds, a 3D object detection network with adaptive multi-modal data fusion is proposed, which makes use of multi-neighborhood voxel information and image information. Firstly, we design an improved ResNet that maintains the structure information of far and hard objects in low-resolution feature maps, making it more suitable for the detection task; meanwhile, the semantics of each image feature map are enhanced by semantic information from all subsequent feature maps. Secondly, we extract multi-neighborhood context information with different receptive field sizes to compensate for the sparseness of the point cloud, improving the ability of voxel features to represent the spatial structure and semantic information of objects. Finally, we propose a multi-modal feature adaptive fusion strategy that uses learnable weights to express the contribution of each modality to the detection task, with voxel attention further enhancing the fused feature expression of valid target objects. Experimental results on the KITTI benchmark show that this method outperforms VoxelNet by remarkable margins, increasing AP by 8.78% and 5.49% on the moderate and hard difficulty levels, respectively. Our method also achieves greater detection performance than many mainstream multi-modal methods, outperforming MVX-Net by 1% AP on the moderate and hard difficulty levels.
Keywords: 3D object detection; adaptive fusion; multi-modal data fusion; attention mechanism; multi-neighborhood features
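The "learnable weights expressing each modality's contribution" idea can be sketched as a softmax over per-modality logits, so the fusion weights stay positive and sum to one. Names, shapes, and the two-modality setup are illustrative assumptions, not the paper's implementation.

```python
# Sketch of adaptive two-modality fusion: softmax over learnable scalar logits
# yields normalized weights applied to voxel and image features.
import numpy as np

def adaptive_fuse(voxel_feat, image_feat, logits):
    w = np.exp(logits) / np.sum(np.exp(logits))  # softmax over modalities
    return w[0] * voxel_feat + w[1] * image_feat, w

voxel_feat = np.ones(4)
image_feat = np.zeros(4)
fused, w = adaptive_fuse(voxel_feat, image_feat, logits=np.array([0.0, 0.0]))
print(w)      # [0.5 0.5] — equal logits give equal weights
print(fused)  # [0.5 0.5 0.5 0.5]
```

During training the logits would be updated by gradient descent, letting the network shift weight toward whichever modality helps detection for the current scene.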
9. Image attention transformer network for indoor 3D object detection
Authors: REN KeYan, YAN Tong, HU ZhaoXin, HAN HongGui, ZHANG YunLu. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2024, No. 7, pp. 2176–2190.
Point clouds and RGB images are both critical data for 3D object detection. While recent multi-modal methods combine them directly and show remarkable performance, they ignore the distinct forms of these two types of data. To mitigate the influence of this intrinsic difference on performance, we propose a novel yet effective fusion model named the LI-Attention model, which takes both RGB features and point cloud features into consideration and assigns a weight to each RGB feature by an attention mechanism. Furthermore, based on the LI-Attention model, we propose a 3D object detection method called the image attention transformer network (IAT-Net), specialized for indoor RGB-D scenes. Compared with previous work on multi-modal detection, IAT-Net fuses elaborate RGB features from 2D detection results with point cloud features through the attention mechanism, while generating and refining 3D detection results with a transformer model. Extensive experiments demonstrate that our approach outperforms state-of-the-art performance on two widely used benchmarks for indoor 3D object detection, SUN RGB-D and NYU Depth V2, and ablation studies analyze the effect of each module. The source code for the proposed IAT-Net is publicly available at https://github.com/wisper181/IAT-Net.
Keywords: 3D object detection; Transformer; attention mechanism
10. Development of vehicle-recognition method on water surfaces using LiDAR data: SPD² (spherically stratified point projection with diameter and distance)
Authors: Eon-ho Lee, Hyeon Jun Jeon, Jinwoo Choi, Hyun-Taek Choi, Sejin Lee. Defence Technology (SCIE, EI, CAS, CSCD), 2024, No. 6, pp. 95–104.
Swarm robot systems are an important application of autonomous unmanned surface vehicles on water surfaces. For monitoring natural environments and conducting security activities within a certain range using a surface vehicle, a swarm robot system is more efficient than operating a single vehicle, as it can reduce cost and save time. Operating a cluster of unmanned surface vehicles requires robust detection of adjacent surface obstacles. For this purpose, a LiDAR (light detection and ranging) sensor is used, as it can obtain 3D information in all directions simultaneously, relatively robustly and accurately, irrespective of the surrounding environmental conditions. Although a GPS (global positioning system) error range exists, obtaining measurements of the surface-vessel position can still ensure stability during platoon maneuvering. In this study, a three-layer convolutional neural network is applied to classify types of surface vehicles. The approach redefines the sparse 3D point cloud data as 2D image data with connotative meaning and then uses this transformed data for object classification; hence, we propose a descriptor that converts 3D point cloud data into 2D image data. To use this descriptor effectively, a clustering operation that separates the point clouds of each object is needed, for which we developed voxel-based clustering. Using the descriptor, the 3D point cloud data are converted into a 2D feature image, which is provided as the input to the network. We verify the validity of the proposed 3D point cloud feature descriptor using experimental data in the simulator, and explore the feasibility of real-time object classification within this framework.
Keywords: object classification; clustering; 3D point cloud data; LiDAR (light detection and ranging); surface vehicle
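One common form of the voxel-based clustering mentioned above hashes points to voxels and merges occupied voxels that touch each other. The voxel size and neighborhood choice below are illustrative assumptions, not the paper's method.

```python
# Voxel-based clustering sketch: points are hashed to voxels, and occupied
# voxels that touch (26-neighborhood) are merged into one cluster via BFS.
import numpy as np
from collections import deque
from itertools import product

def voxel_cluster(points, voxel=0.5):
    keys = set(map(tuple, np.floor(points / voxel).astype(int)))
    clusters, seen = [], set()
    for k in keys:
        if k in seen:
            continue
        comp, q = [], deque([k])
        seen.add(k)
        while q:
            v = q.popleft()
            comp.append(v)
            for d in product((-1, 0, 1), repeat=3):  # 26-neighborhood (+ self)
                n = (v[0] + d[0], v[1] + d[1], v[2] + d[2])
                if n in keys and n not in seen:
                    seen.add(n)
                    q.append(n)
        clusters.append(comp)
    return clusters

pts = np.array([[0.0, 0.0, 0.0], [0.4, 0.1, 0.0],   # one nearby object
                [5.0, 5.0, 0.0], [5.2, 5.1, 0.1]])  # another, far away
print(len(voxel_cluster(pts)))  # 2 clusters
```

Each cluster's points would then be fed to the 3D-to-2D descriptor separately before classification.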
11. Traffic Accident Detection Based on Deformable Frustum Proposal and Adaptive Space Segmentation
Authors: Peng Chen, Weiwei Zhang, Ziyao Xiao, Yongxiang Tian. Computer Modeling in Engineering & Sciences (SCIE, EI), 2022, No. 1, pp. 97–109.
Road accident detection plays an important role in abnormal scene reconstruction for intelligent transportation systems and in warning of abnormal events for autonomous driving. This paper presents a novel 3D object detector and an adaptive space partitioning algorithm to infer traffic accidents quantitatively. Using 2D region proposals in an RGB image, the method generates a deformable frustum from the point cloud for each 2D region proposal and then extracts features frustum-wise using a farthest point sampling network (FPS-Net) and a feature extraction network (FE-Net). Subsequently, an encoder-decoder network (ED-Net) performs 3D oriented bounding box (OBB) regression, and the proposed adaptive least square regression (ALSR) method splits the 3D OBB. Finally, a reduced OBB intersection test detects traffic accidents via the separating surface theorem (SST). In experiments on the KITTI benchmark, the proposed 3D object detector outperforms other state-of-the-art methods, and the collision detection algorithm achieves a satisfactory accuracy of 91.8% on our SHTA dataset.
Keywords: traffic accident detection; 3D object detection; deformable frustum proposal; adaptive space segmentation
12. Researches on Cartographic Database-Based Interactive Three-Dimensional Topographic Map
Authors: Jiang Wenping, Xi Daping. Journal of China University of Geosciences (SCIE, CSCD), 2003, No. 4, pp. 374–380.
With the development of computer graphics, three-dimensional (3D) visualization has brought a technological revolution to traditional cartography. The topographic 3D-map has emerged to suit this revolution, and its applications are spreading rapidly to related fields owing to its incomparable advantages. Research on digital maps and the construction of map databases offer strong technical support and abundant data sources for this new technology, so the research and development of the topographic 3D-map will receive growing attention. The basic data of the topographic 3D-map are rooted mainly in the digital map, and its basic model is derived from the digital elevation model (DEM) and 3D models of other DEM-based geographic features. In view of the potentially enormous data volume and the complexity of geographic features, the dynamic representation of geographic information is the focus of topographic 3D-map research and a prerequisite for 3D query and analysis. Besides hardware limitations, which to some extent constrain 3D representation, the data organization structure of geographic information is the core research problem of the 3D-map. Level of detail (LOD), space partitioning, dynamic object loading (DOL), and object culling are the core technologies of dynamic 3D representation. Object selection, attribute query, and model editing are important interaction tools that a topographic 3D-map system provides to users, all of which are based on the data structure of the 3D model. This paper discusses the basic theories, concepts, and cardinal principles of the topographic 3D-map, expounds the basic way to organize the scene hierarchy of the topographic 3D-map based on the node mechanism, and studies its dynamic representation technologies based on LOD, space partitioning, DOL, and object culling. Moreover, interactive operation functions such as spatial query, scene editing, and management of the topographic 3D-map are explored. Finally, the paper briefly describes applications of the topographic 3D-map in related fields.
Keywords: three-dimensional (3D) visualization; topographic 3D-map; level of detail (LOD); space partitioning; dynamic object loading (DOL); dynamic representation
13. 3D Bounding Box Proposal for On-Street Parking Space Status Sensing in Real World Conditions (cited: 1)
Authors: Yaocheng Zheng, Weiwei Zhang, Xuncheng Wu, Bo Zhao. Computer Modeling in Engineering & Sciences (SCIE, EI), 2019, No. 6, pp. 559–576.
Vision-based technologies have been extensively applied to on-street parking space sensing, aiming to provide timely and accurate information for drivers and to improve daily travel convenience. However, such sensing faces great challenges, as partial visualization regularly occurs owing to occlusion by static or dynamic objects or the limited perspective of the camera. This paper presents an imagery-based framework that infers parking space status by generating a 3D bounding box for each vehicle. A specially designed convolutional neural network based on ResNet and a feature pyramid network is proposed to overcome the challenges of partial visualization and occlusion. It predicts 3D box candidates on multi-scale feature maps with five different 3D anchors, which are generated by clustering diverse scales of ground truth boxes according to different vehicle templates in the source dataset. Subsequently, a vehicle distribution map is constructed jointly from the coordinates of the vehicle boxes and manually segmented parking spaces, where the normative degree of a parked vehicle is calculated as the intersection over union between the vehicle's box and the parking space edge. In space status inference, to further eliminate mutual vehicle interference, three adjacent spaces are combined into one unit and a multinomial logistic regression model is trained to refine the status of the unit. Experiments on the KITTI benchmark and Shanghai roads show that the proposed method outperforms most monocular approaches in 3D box regression and achieves satisfactory accuracy in space status inference.
Keywords: 3D object proposal; image processing and analysis; parking space detection; fully convolutional network; multinomial logistic regression model
14. 3D Depth Measurement for Holoscopic 3D Imaging System
Authors: Eman Alazawi, Mohammad Rafiq Swash, Maysam Abbod. Journal of Computer and Communications, 2016, No. 6, pp. 49–67.
Holoscopic 3D imaging is a true 3D imaging system that mimics the fly's-eye technique to acquire a true 3D optical model of a real scene. To reconstruct the 3D image computationally, an efficient implementation of an Auto-Feature-Edge (AFE) descriptor algorithm is required, providing an individual feature detector for the integration of 3D information to locate objects in the scene. The AFE descriptor plays a key role in simplifying the detection of both edge-based and region-based objects. The detector is based on a Multi-Quantize Adaptive Local Histogram Analysis (MQALHA) algorithm, which is distinctive for each Feature-Edge (FE) block, i.e., the large contrast changes (gradients) in an FE are easier to localise. The novelty of this work lies in generating a noise-free 3D map (3DM) from a correlation analysis of region contours, which automatically combines the available depth estimation technique with an edge-based feature shape recognition technique. The application area consists of two varied domains, which demonstrate the efficiency and robustness of the approach: (a) extracting a set of feature edges for both the tracking and mapping processes of 3D depth-map estimation, and (b) separating and recognising in-focus objects in the scene. Experimental results show that the proposed 3DM technique performs efficiently compared with state-of-the-art algorithms.
Keywords: holoscopic 3D image; edge detection; auto-thresholding; depth map; integral image; local histogram analysis; object recognition and depth measurement
15. ARM3D: Attention-based relation module for indoor 3D object detection (cited: 4)
Authors: Yuqing Lan, Yao Duan, Chenyi Liu, Chenyang Zhu, Yueshan Xiong, Hui Huang, Kai Xu. Computational Visual Media (SCIE, EI, CSCD), 2022, No. 3, pp. 395–414.
Relation contexts have been proved useful for many challenging vision tasks. In the field of 3D object detection, previous methods have taken advantage of context encoding, graph embedding, or explicit relation reasoning to extract relation contexts. However, redundant relation contexts inevitably exist due to noisy or low-quality proposals. In fact, invalid relation contexts usually indicate underlying scene misunderstanding and ambiguity, which may, on the contrary, reduce performance in complex scenes. Inspired by recent attention mechanisms such as the Transformer, we propose a novel 3D attention-based relation module (ARM3D). It encompasses object-aware relation reasoning to extract pairwise relation contexts among qualified proposals, and an attention module to distribute attention weights over the different relation contexts. In this way, ARM3D can take full advantage of useful relation contexts while filtering those that are less relevant or even confusing, mitigating ambiguity in detection. We have evaluated the effectiveness of ARM3D by plugging it into several state-of-the-art 3D object detectors, obtaining more accurate and robust detection results. Extensive experiments show the capability and generalization of ARM3D on 3D object detection. Our source code is available at https://github.com/lanlan96/ARM3D.
Keywords: attention mechanism; scene understanding; relational reasoning; 3D indoor object detection
16. 3D Object Detection Based on Vanishing Point and Prior Orientation (cited: 2)
Authors: GAO Yongbin, ZHAO Huaqing, FANG Zhijun, HUANG Bo, ZHONG Cengsi. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2019, No. 5, pp. 369–375.
3D object detection is one of the most challenging research tasks in computer vision. To remove the dependence of 3D object proposals on template information in 2.5D-information-based 3D object detection, we propose a 3D object detector that fuses vanishing points with prior orientation; it estimates an accurate 3D proposal from 2.5D data and provides an excellent starting point for 3D object classification and localization. The algorithm first calculates three mutually orthogonal vanishing points using the Euler angle principle and projects them into the pixel coordinate system. Then, the top edge of the 2D proposal is sampled at a preset pitch and the first vertex is taken. Finally, the remaining seven vertices of the 3D proposal are calculated from the linear relationship between the three vanishing points and the vertices, yielding the complete 3D proposal. Experimental results show that the proposed method improves the mean average precision score by 2.7% over the Amodal3Det method.
Keywords: image analysis; 3D object detection; prior orientation; vanishing point; Euler angle
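The three orthogonal vanishing points described above are the images of the world axis directions under a rotation built from Euler angles and the camera intrinsics. The Z-Y-X Euler convention, intrinsics, and angle values below are illustrative assumptions, not the paper's calibration.

```python
# Sketch: vanishing points as projections of the three world axis directions.
# A direction d maps to the homogeneous image point K @ R @ d.
import numpy as np

def vanishing_points(K, yaw, pitch, roll):
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    R = Rz @ Ry @ Rx                       # Z-Y-X Euler rotation
    vps = []
    for axis in np.eye(3):                 # world x, y, z directions
        d = K @ R @ axis                   # homogeneous image point
        vps.append(d[:2] / d[2] if abs(d[2]) > 1e-9 else None)  # None: at infinity
    return vps

K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])  # toy intrinsics
vps = vanishing_points(K, yaw=0.1, pitch=0.2, roll=0.05)
```

Once the three vanishing points are in pixel coordinates, box vertices can be recovered from the collinearity of each cuboid edge with its vanishing point, as the abstract outlines.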
17. RGB Image- and Lidar-Based 3D Object Detection Under Multiple Lighting Scenarios (cited: 1)
Authors: Wentao Chen, Wei Tian, Xiang Xie, Wilhelm Stork 《Automotive Innovation》 EI CSCD 2022, Issue 3, pp. 251-259 (9 pages)
In recent years, camera- and lidar-based 3D object detection has achieved great progress. However, related research mainly focuses on normal illumination conditions; the performance of these 3D detection algorithms decreases under low-lighting scenarios such as at night. This work attempts to improve fusion strategies for 3D vehicle detection accuracy under multiple lighting conditions. First, distance and uncertainty information is incorporated to guide the "painting" of semantic information onto the point cloud during data preprocessing. Moreover, a multitask framework is designed that incorporates uncertainty learning to improve detection accuracy under low-illumination scenarios. In validation on the KITTI and Dark-KITTI benchmarks, the proposed method increases vehicle detection accuracy on the KITTI benchmark by 1.35%, and the generality of the model is validated on the proposed Dark-KITTI dataset, with a gain of 0.64% for vehicle detection.
Keywords: 3D object detection, multi-sensor fusion, uncertainty estimation, semantic segmentation, PointPainting
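The "painting" step, attaching image semantic scores to lidar points, might look like the sketch below. The per-pixel confidence weighting is a loose stand-in for the paper's distance- and uncertainty-guided scheme, and `project`, `seg_scores`, and `seg_conf` are hypothetical inputs.

```python
def paint_points(points, seg_scores, seg_conf, project, img_w, img_h):
    """Append semantic class scores to each lidar point ("painting").
    Scores are scaled by a per-pixel confidence so that uncertain
    segmentation contributes less; points falling outside the image
    are skipped."""
    painted = []
    for pt in points:
        uv = project(pt)  # hypothetical camera projection: (u, v) or None
        if uv is None:
            continue
        u, v = int(uv[0]), int(uv[1])
        if 0 <= u < img_w and 0 <= v < img_h:
            w = seg_conf[v][u]
            painted.append(list(pt) + [w * s for s in seg_scores[v][u]])
    return painted
```

Downstream, a detector consumes the augmented points exactly as it would raw points, just with extra feature channels per point.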
3D Object Detection Incorporating Instance Segmentation and Image Restoration
18
Authors: HUANG Bo, HUANG Man, GAO Yongbin, YU Yuxin, JIANG Xiaoyan, ZHANG Juan 《Wuhan University Journal of Natural Sciences》 CAS CSCD 2019, Issue 4, pp. 360-368 (9 pages)
Nowadays, 3D object detection, which uses color and depth information to localize objects in the 3D world and estimate their physical size and pose, is one of the most important 3D perception tasks in computer vision. To address the mixed segmentation results that occur when multiple instances appear in one frustum in the F-PointNet method, and the loss of depth information caused by occlusion, this paper proposes a 3D object detection approach based on instance segmentation and image restoration. First, instance segmentation with Mask R-CNN on an RGB image avoids mixed segmentation results. Second, for detected occluded objects, the occluding object is first removed from the depth map, and the resulting empty pixel region is restored with the Criminisi algorithm to recover the object's missing depth information. Experimental results show that the proposed method improves the average precision score compared with the F-PointNet method.
Keywords: image processing, 3D object detection, instance segmentation, depth information, image restoration
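The depth-restoration step could be approximated as below. This is a simple iterative neighbour-average fill, far cruder than the exemplar-based Criminisi algorithm the paper uses, but it illustrates how masked (occluded) depth pixels are recovered from the known border inward.

```python
def fill_depth_holes(depth, mask):
    """Fill masked (occluded) depth pixels from known neighbours,
    working inward from the hole boundary. A crude stand-in for
    exemplar-based inpainting."""
    h, w = len(depth), len(depth[0])
    holes = {(r, c) for r in range(h) for c in range(w) if mask[r][c]}
    while holes:
        progress = []
        for r, c in holes:
            # Average the 4-connected neighbours that are already known.
            vals = [depth[r + dr][c + dc]
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= r + dr < h and 0 <= c + dc < w
                    and (r + dr, c + dc) not in holes]
            if vals:
                progress.append((r, c, sum(vals) / len(vals)))
        if not progress:
            break  # isolated region with no known border
        for r, c, v in progress:
            depth[r][c] = v
            holes.discard((r, c))
    return depth
```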
PointGAT: Graph attention networks for 3D object detection
19
Authors: Haoran Zhou, Wei Wang, Gang Liu, Qingguo Zhou 《Intelligent and Converged Networks》 EI 2022, Issue 2, pp. 204-216 (13 pages)
3D object detection is a critical technology in many applications, and among the various detection methods, point-cloud-based methods have been the most popular research topic in recent years. Since graph neural networks (GNNs) are considered effective for processing point clouds, in this work we combine them with the attention mechanism and propose a 3D object detection method named PointGAT. Our proposed PointGAT outperforms previous approaches on the KITTI test dataset. Experiments in real campus scenarios also demonstrate the potential of our method for further applications.
Keywords: 3D object detection, point cloud, graph neural network, attention mechanism
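A single graph-attention aggregation step of the kind PointGAT builds on might be sketched as follows. The dot-product scores and the absence of learned projection matrices are simplifications for illustration; a real layer would transform features with trained weights before scoring.

```python
import math

def graph_attention(node_feats, edges):
    """One simplified graph-attention step: each node aggregates
    neighbour features weighted by a softmax over dot-product scores.
    `edges` is a list of directed (src, dst) pairs: dst is a
    neighbour that src attends to."""
    out = []
    for i, fi in enumerate(node_feats):
        nbrs = [j for a, j in edges if a == i]
        if not nbrs:
            out.append(list(fi))  # no neighbours: pass feature through
            continue
        scores = [sum(a * b for a, b in zip(fi, node_feats[j])) for j in nbrs]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        dim = len(fi)
        out.append([sum(w * node_feats[j][d] for w, j in zip(weights, nbrs))
                    for d in range(dim)])
    return out
```

On point clouds the graph is typically built from k-nearest neighbours in 3D space, so each point attends only to its local neighbourhood.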
LWD-3D:Lightweight Detector Based on Self-Attention for 3D Object Detection
20
Authors: Shuo Yang, Huimin Lu, Tohru Kamiya, Yoshihisa Nakatoh, Seiichi Serikawa 《CAAI Artificial Intelligence Research》 2022, Issue 2, pp. 137-143 (7 pages)
Lightweight modules play a key role in 3D object detection tasks for autonomous driving and are necessary for deploying 3D object detectors. At present, research still focuses on constructing complex models and computations to improve detection precision at the expense of running speed. Building a lightweight model that learns global features from point cloud data for 3D object detection therefore remains a significant problem. In this paper, we combine convolutional neural networks with self-attention-based vision transformers to realize lightweight, high-speed computation for 3D object detection. We propose Lightweight Detection 3D (LWD-3D), a point cloud conversion and lightweight vision transformer for autonomous driving. LWD-3D utilizes a one-shot regression framework in 2D space and generates 3D object bounding boxes from point cloud data, providing a new vision-transformer-based feature representation for 3D detection applications. Experiments on the KITTI 3D dataset show that LWD-3D achieves real-time detection (under 20 ms per image). LWD-3D obtains a mean average precision (mAP) 75% higher than that of another real-time 3D detector, with half the number of parameters. Our research extends the application of vision transformers to 3D object detection tasks.
Keywords: 3D object detection, point clouds, vision transformer, one-shot regression, real-time
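The "point cloud conversion" feeding a 2D one-shot regression backbone is commonly a bird's-eye-view rasterisation; a minimal sketch under that assumption (max-height cells, hypothetical ranges) is shown below. The paper's actual conversion may differ.

```python
def points_to_bev(points, x_range, y_range, cell):
    """Rasterise (x, y, z) points into a bird's-eye-view grid whose
    cells hold the maximum point height, a common conversion before
    feeding a 2D (e.g. transformer) backbone."""
    w = int((x_range[1] - x_range[0]) / cell)
    h = int((y_range[1] - y_range[0]) / cell)
    grid = [[0.0] * w for _ in range(h)]
    for x, y, z in points:
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            col = int((x - x_range[0]) / cell)
            row = int((y - y_range[0]) / cell)
            grid[row][col] = max(grid[row][col], z)
    return grid
```

Once the cloud is a fixed-size 2D grid, box regression can run in a single forward pass, which is what makes the one-shot framework fast.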