Journal Articles
818 articles found
1. Depth-Guided Vision Transformer With Normalizing Flows for Monocular 3D Object Detection
Authors: Cong Pan, Junran Peng, Zhaoxiang Zhang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, No. 3, pp. 673-689 (17 pages)
Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then use LiDAR-based object detectors, or focus on the perspective of image and depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and complex convolutional fusion modes. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps to achieve more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them with cross-attention so that the two modalities exchange information. Furthermore, with the help of pixel-wise relative depth values in depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of the proposed method and its superior performance over previous counterparts.
Keywords: monocular 3D object detection; normalizing flows; Swin Transformer
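The cross-attention fusion of RGB and depth patch tokens, with a bias derived from relative depth, can be illustrated with a short PyTorch sketch. This is not the authors' code: the token shapes, the per-patch scalar depth, and the linear bias head are illustrative assumptions.

import torch
import torch.nn as nn

class DepthGuidedCrossAttention(nn.Module):
    """Cross-attention between RGB patch tokens (queries) and depth patch
    tokens (keys/values), with an extra bias derived from relative depth
    differences between patches."""
    def __init__(self, dim=96, num_heads=4):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        # Maps a scalar relative-depth value to one additive bias per attention head.
        self.depth_bias = nn.Linear(1, num_heads)

    def forward(self, rgb_tokens, depth_tokens, patch_depths):
        # rgb_tokens, depth_tokens: (B, N, dim); patch_depths: (B, N) mean depth per patch
        B, N, C = rgb_tokens.shape
        H = self.num_heads
        q = self.q(rgb_tokens).view(B, N, H, C // H).transpose(1, 2)             # (B, H, N, d)
        k, v = self.kv(depth_tokens).chunk(2, dim=-1)
        k = k.view(B, N, H, C // H).transpose(1, 2)
        v = v.view(B, N, H, C // H).transpose(1, 2)
        # Pairwise relative depth between patches -> per-head additive attention bias.
        rel = (patch_depths[:, :, None] - patch_depths[:, None, :]).unsqueeze(-1)  # (B, N, N, 1)
        bias = self.depth_bias(rel).permute(0, 3, 1, 2)                            # (B, H, N, N)
        attn = (q @ k.transpose(-2, -1)) * self.scale + bias
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Example: fuse 8x8 = 64 patch tokens of width 96.
fused = DepthGuidedCrossAttention()(torch.randn(2, 64, 96), torch.randn(2, 64, 96), torch.rand(2, 64) * 60)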
2. Algorithm and System of Scanning Color 3D Objects (Cited by 1)
Authors: 许智钦, 孙长库, 郑义忠. Transactions of Tianjin University (EI, CAS), 2002, No. 2, pp. 134-138 (5 pages)
This paper presents a complete system for scanning the geometry and texture of a large 3D object; automatic registration is then performed to obtain a whole realistic 3D model. The system is composed of one line-strip laser and one color CCD camera. The scanned object is pictured twice by the color CCD camera: first the texture of the scanned object is captured, and then the 3D information of the scanned object is obtained from the laser plane equations. The paper presents a practical way to implement this three-dimensional measuring method and the automatic registration of a large 3D object, and a good result is obtained after experimental verification.
Keywords: 3D measurement; color 3D object; laser scanning; surface construction
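The core triangulation step, recovering a 3D point from a laser-stripe pixel and a calibrated laser plane, might look as follows. This is a minimal sketch only: the intrinsics and plane coefficients are made-up example values, and texture mapping and registration are omitted.

import numpy as np

def triangulate_laser_point(pixel, K, plane):
    """Recover the 3D point seen at `pixel` on the laser stripe by intersecting
    the back-projected camera ray with the calibrated laser plane.
    pixel: (u, v) image coordinates; K: 3x3 camera intrinsics;
    plane: (a, b, c, d) with a*X + b*Y + c*Z + d = 0 in camera coordinates."""
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # direction of the back-projected ray
    n, d = np.array(plane[:3]), plane[3]
    t = -d / (n @ ray)                               # ray parameter where the ray meets the plane
    return t * ray                                   # 3D point in camera coordinates

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
plane = (0.0, 0.7071, 0.7071, -500.0)                # example calibrated laser plane
print(triangulate_laser_point((350, 260), K, plane))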
3. General and robust voxel feature learning with Transformer for 3D object detection (Cited by 1)
Authors: LI Yang, GE Hongwei. Journal of Measurement Science and Instrumentation (CAS, CSCD), 2022, No. 1, pp. 51-60 (10 pages)
Self-attention networks and the Transformer have dominated machine translation and natural language processing, and have shown great potential in image vision tasks such as image classification and object detection. Inspired by this progress, we propose a novel, general, and robust voxel feature encoder for 3D object detection based on the traditional Transformer. We first investigate the permutation invariance of self-attention on sequence data and apply it to point cloud processing. Then we construct a voxel feature layer based on self-attention that adaptively learns a local and robust context for each voxel from the spatial relationships and context information exchanged among all points within the voxel. Lastly, we construct a general voxel feature learning framework with the voxel feature layer at its core for 3D object detection. The voxel feature with Transformer (VFT) can easily be plugged into any other voxel-based 3D object detection framework and serves as the backbone of the voxel feature extractor. Experimental results on the KITTI dataset demonstrate that our method achieves state-of-the-art performance on 3D object detection.
Keywords: 3D object detection; self-attention networks; voxel feature with Transformer (VFT); point cloud; encoder-decoder
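A minimal sketch of a permutation-invariant voxel feature layer built on self-attention, in the spirit described above (the feature sizes, padding scheme, and pooling choice are assumptions, not the paper's implementation):

import torch
import torch.nn as nn

class VoxelSelfAttention(nn.Module):
    """Encode each voxel by running self-attention over the points it contains
    and max-pooling the attended point features; both steps are invariant to
    the order of points inside the voxel."""
    def __init__(self, in_dim=4, dim=64, num_heads=4):
        super().__init__()
        self.embed = nn.Linear(in_dim, dim)                     # lift raw (x, y, z, reflectance) points
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, voxel_points, pad_mask):
        # voxel_points: (V, T, in_dim) points per voxel, zero-padded to T points
        # pad_mask: (V, T) True where the slot is padding
        x = self.embed(voxel_points)
        attended, _ = self.attn(x, x, x, key_padding_mask=pad_mask)
        x = self.norm(x + attended)
        x = x.masked_fill(pad_mask.unsqueeze(-1), float('-inf'))
        return x.max(dim=1).values                              # (V, dim) voxel features

voxels = torch.randn(128, 32, 4)                                # 128 voxels, up to 32 points each
pad = torch.zeros(128, 32, dtype=torch.bool)
features = VoxelSelfAttention()(voxels, pad)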
4. Exploring Local Regularities for 3D Object Recognition
Authors: TIAN Huaiwen, QIN Shengfeng. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2016, No. 6, pp. 1104-1113 (10 pages)
In order to find better simplicity measurements for 3D object recognition, a new set of local regularities is developed and tested in a stepwise 3D reconstruction method, including localized minimizing standard deviation of angles (L-MSDA), localized minimizing standard deviation of segment magnitudes (L-MSDSM), localized minimum standard deviation of areas of child faces (L-MSDAF), localized minimum sum of segment magnitudes of common edges (L-MSSM), and localized minimum sum of areas of child faces (L-MSAF). Based on their effectiveness measured in terms of form and size distortions, it is found that combining the two local regularities L-MSDA and L-MSDSM produces better performance, and that the best weightings for the combination are 10% for L-MSDSM and 90% for L-MSDA. The test results show that the combined use of L-MSDA and L-MSDSM with these weightings has the potential to be applied in other optimization-based 3D recognition methods to improve their efficacy and robustness.
Keywords: stepwise 3D reconstruction; localized regularities; 3D object recognition; polyhedral objects; line drawing
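A toy sketch of the 90/10 weighted combination of the two winning regularities. The normalisation by the mean is an illustrative choice to make the angle and length terms comparable; the paper's exact objective may differ.

import numpy as np

def combined_regularity(angles_deg, segment_lengths, w_msda=0.9, w_msdsm=0.1):
    """Weighted simplicity score: 90% standard deviation of angles (L-MSDA)
    plus 10% standard deviation of segment magnitudes (L-MSDSM).
    Lower values indicate a more regular local reconstruction."""
    msda = np.std(angles_deg) / (np.mean(angles_deg) + 1e-9)            # normalised so the two
    msdsm = np.std(segment_lengths) / (np.mean(segment_lengths) + 1e-9)  # terms are comparable
    return w_msda * msda + w_msdsm * msdsm

# Two candidate local reconstructions around one vertex of a line drawing:
print(combined_regularity([88, 91, 90], [1.0, 1.05, 0.98]))   # near-regular  -> small score
print(combined_regularity([60, 120, 95], [0.4, 1.9, 1.1]))    # distorted     -> larger score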
5. MFF-Net: Multimodal Feature Fusion Network for 3D Object Detection
Authors: Peicheng Shi, Zhiqiang Liu, Heng Qi, Aixi Yang. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 5615-5637 (23 pages)
In complex traffic scenarios, it is very important for autonomous vehicles to accurately perceive, in advance, the dynamic information of the other vehicles around them. The accuracy of 3D object detection is affected by problems such as illumination changes, object occlusion, and detection distance. We face these challenges by proposing a multimodal feature fusion network for 3D object detection (MFF-Net). This paper first uses a spatial transformation projection algorithm to map image features into the feature space, so that the image features share the same spatial dimension as the point cloud features they are fused with. Then, feature channel weighting is performed with an adaptive expression augmentation fusion network to enhance important features, suppress useless ones, and increase the directionality of the network towards informative features. Finally, this paper reduces false detections and missed detections in the non-maximum suppression algorithm by raising a one-dimensional threshold. In this way, a complete 3D object detection network based on multimodal feature fusion is constructed. Experimental results show that the proposed network achieves an average accuracy of 82.60% on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset, outperforming previous state-of-the-art multimodal fusion networks. On the Easy, Moderate, and Hard evaluation levels, the accuracy reaches 90.96%, 81.46%, and 75.39%, respectively. This shows that MFF-Net performs well in 3D object detection.
Keywords: 3D object detection; multimodal fusion; neural network; autonomous driving; attention mechanism
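A sketch of adaptive channel weighting after fusion, using a squeeze-and-excitation style gate as a stand-in for the paper's adaptive expression augmentation fusion network (channel counts and the assumption that both inputs are already aligned on the same BEV grid are illustrative):

import torch
import torch.nn as nn

class AdaptiveChannelFusion(nn.Module):
    """Concatenate projected image features with point-cloud (BEV) features,
    then re-weight channels with a squeeze-and-excitation style gate so that
    informative channels are enhanced and uninformative ones suppressed."""
    def __init__(self, img_ch=64, pc_ch=64, reduction=8):
        super().__init__()
        ch = img_ch + pc_ch
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                           # squeeze: global channel statistics
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )
        self.out = nn.Conv2d(ch, pc_ch, 1)

    def forward(self, img_feat, pc_feat):
        x = torch.cat([img_feat, pc_feat], dim=1)              # both already in the same BEV grid
        return self.out(x * self.gate(x))                      # excitation: per-channel weights in (0, 1)

fused = AdaptiveChannelFusion()(torch.randn(2, 64, 200, 176), torch.randn(2, 64, 200, 176))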
6. Monocular 3D object detection with Pseudo-LiDAR confidence sampling and hierarchical geometric feature extraction in 6G network
Authors: Jianlong Zhang, Guangzu Fang, Bin Wang, Xiaobo Zhou, Qingqi Pei, Chen Chen. Digital Communications and Networks (SCIE, CSCD), 2023, No. 4, pp. 827-835 (9 pages)
The high bandwidth and low latency of 6G network technology enable the successful application of monocular 3D object detection on vehicle platforms. Monocular 3D object detection based on Pseudo-LiDAR is a low-cost, low-power alternative to LiDAR solutions in the field of autonomous driving. However, this technique has two problems: (1) the poor quality of the generated Pseudo-LiDAR point clouds, resulting from the nonlinear error distribution of monocular depth estimation, and (2) the weak representation capability of point cloud features, because existing LiDAR-based 3D detection networks neglect the global geometric structure of point clouds. Therefore, we propose a Pseudo-LiDAR confidence sampling strategy and a hierarchical geometric feature extraction module for monocular 3D object detection. We first design a point cloud confidence sampling strategy based on a 3D Gaussian distribution that assigns low confidence to points with large depth estimation errors and filters them out according to this confidence. We then present a hierarchical geometric feature extraction module that aggregates local neighborhood features and uses a dual transformer to capture the global geometric features of the point cloud. Finally, our detection framework is based on Point-Voxel-RCNN (PV-RCNN), with the high-quality Pseudo-LiDAR and the enriched geometric features as input. Experimental results show that our method achieves satisfactory results in monocular 3D object detection.
Keywords: monocular 3D object detection; Pseudo-LiDAR; confidence sampling; hierarchical geometric feature extraction
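One plausible reading of Gaussian confidence sampling is sketched below: confidence decays following a 3D Gaussian and low-confidence points are filtered out. Centring the Gaussian on coarse object centres, the value of sigma, and the threshold are all assumptions made for illustration, not the paper's exact scheme.

import numpy as np

def confidence_filter(points, centers, sigma=1.5, threshold=0.4):
    """Assign each Pseudo-LiDAR point a confidence from an isotropic 3D Gaussian
    centred on the nearest coarse object centre, then drop low-confidence points
    (those far from any object, where depth-estimation error tends to be large).
    points: (N, 3); centers: (M, 3)."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # (N, M) squared distances
    conf = np.exp(-d2.min(axis=1) / (2.0 * sigma ** 2))              # Gaussian confidence in (0, 1]
    return points[conf >= threshold], conf

pts = np.random.uniform(-10, 10, size=(5000, 3))
kept, conf = confidence_filter(pts, centers=np.array([[2.0, 0.0, 1.0], [-4.0, 3.0, 0.5]]))
print(kept.shape)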
7. 3D Object Detection with Attention: Shell-Based Modeling
Authors: Xiaorui Zhang, Ziquan Zhao, Wei Sun, Qi Cui. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 7, pp. 537-550 (14 pages)
LiDAR point cloud-based 3D object detection aims to sense the surrounding environment by anchoring objects with bounding boxes (BBox). However, in the three-dimensional space of autonomous driving scenes, previous detection methods pre-process the original LiDAR point cloud into voxels or pillars, which loses the coordinate information of the original points, slows detection, and yields inaccurate bounding box positioning. To address these issues, this study proposes a new two-stage network structure that extracts point cloud features directly with PointNet++, effectively preserving the original point cloud coordinates. To improve detection accuracy, a shell-based modeling method is proposed: it first roughly determines which spherical shell a coordinate belongs to, then refines the result towards the ground truth, thereby narrowing the localization range and improving accuracy. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module with a skip-connection structure that weights selected features along the feature dimensions; after training, the features that favor object detection receive larger weights, so the extracted features are better adapted to the detection task. Extensive comparison and ablation experiments on the KITTI dataset verify the effectiveness of the proposed method in improving recall and precision.
Keywords: 3D object detection; autonomous driving; point cloud; shell-based modeling; self-attention mechanism
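The coarse shell-assignment step might be sketched as follows (the shell width and shell count are assumed values; the refinement toward the ground truth and the attention module are omitted):

import numpy as np

def shell_index(points, center, shell_width=0.5, num_shells=8):
    """Coarse localisation step of shell-based modelling: bin each point by the
    spherical shell (radial distance band around the proposal centre) it falls in.
    Returns an integer shell id in [0, num_shells - 1] per point."""
    radii = np.linalg.norm(points - center, axis=1)
    return np.clip((radii / shell_width).astype(int), 0, num_shells - 1)

pts = np.random.randn(1024, 3) * 2.0
ids = shell_index(pts, center=np.zeros(3))
print(np.bincount(ids, minlength=8))          # how many points land in each shell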
8. 3D Object Recognition by Classification Using Neural Networks
Authors: Mostafa Elhachloufi, Ahmed El Oirrak, Aboutajdine Driss, M. Najib Kaddioui Mohamed. Journal of Software Engineering and Applications, 2011, No. 5, pp. 306-310 (5 pages)
In this paper, a classification method based on neural networks is presented for the recognition of 3D objects. The objective is to classify a query object against the objects in a database, which leads to recognition of the former. The 3D objects of this database are transformations of other objects by one element of the overall transformation group; the set of transformations considered in this work is the general affine group.
Keywords: recognition; classification; 3D object; neural network; affine transformation
9. Image attention transformer network for indoor 3D object detection
Authors: REN KeYan, YAN Tong, HU ZhaoXin, HAN HongGui, ZHANG YunLu. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2024, No. 7, pp. 2176-2190 (15 pages)
Point clouds and RGB images are both critical data for 3D object detection. While recent multi-modal methods combine them directly and show remarkable performance, they ignore the distinct forms of these two types of data. To mitigate the influence of this intrinsic difference on performance, we propose a novel yet effective fusion model named the LI-Attention model, which takes both RGB features and point cloud features into consideration and assigns a weight to each RGB feature through an attention mechanism. Furthermore, based on the LI-Attention model, we propose a 3D object detection method called the image attention transformer network (IAT-Net), specialized for indoor RGB-D scenes. Compared with previous work on multi-modal detection, IAT-Net fuses elaborate RGB features from 2D detection results with point cloud features through the attention mechanism, and meanwhile generates and refines 3D detection results with a transformer model. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods on two widely used benchmarks for indoor 3D object detection, SUN RGB-D and NYU Depth V2, and ablation studies analyze the effect of each module. The source code for the proposed IAT-Net is publicly available at https://github.com/wisper181/IAT-Net.
Keywords: 3D object detection; transformer; attention mechanism
10. Rail-Pillar Net: A 3D Detection Network for Railway Foreign Objects Based on LiDAR
Authors: Fan Li, Shuyao Zhang, Jie Yang, Zhicheng Feng, Zhichao Chen. Computers, Materials & Continua (SCIE, EI), 2024, No. 9, pp. 3819-3833 (15 pages)
Aiming at the limitations of existing railway foreign object detection methods based on two-dimensional (2D) images, such as short detection distance, strong environmental influence, and the lack of distance information, we propose Rail-PillarNet, a three-dimensional (3D) LiDAR (Light Detection and Ranging) railway foreign object detection method based on an improved PointPillars. Firstly, a parallel attention pillar encoder (PAPE) is designed to fully extract pillar features and alleviate the loss of local fine-grained information in the PointPillars pillar encoder. Secondly, a fine backbone network is designed to improve the feature extraction capability of the network by combining the coding characteristics of LiDAR point cloud features with a residual structure. Finally, the initial weight parameters of the model are optimised by transfer learning to further improve accuracy. Experimental results on the OSDaR23 dataset show that the average accuracy of Rail-PillarNet reaches 58.51%, higher than most mainstream models, with 5.49 M parameters. Compared with PointPillars, the accuracy of each target is improved by 10.94%, 3.53%, 16.96%, and 19.90%, respectively, while the number of parameters increases by only 0.64 M, achieving a balance between parameter count and accuracy.
Keywords: railway foreign object; light detection and ranging (LiDAR); 3D object detection; PointPillars; parallel attention mechanism; transfer learning
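A sketch of a pillar encoder with attention pooling, illustrating how per-point weights can preserve fine-grained information that plain max pooling discards. The 9-dimensional decorated point features follow the usual PointPillars convention; the parallel-attention details of PAPE are not reproduced here.

import torch
import torch.nn as nn

class AttentionPillarEncoder(nn.Module):
    """PointPillars-style pillar encoder with attention pooling instead of plain
    max pooling: each point in a pillar receives a learned weight, so fine-grained
    points are not discarded as aggressively."""
    def __init__(self, in_dim=9, dim=64):
        super().__init__()
        self.pointnet = nn.Sequential(nn.Linear(in_dim, dim), nn.ReLU(inplace=True))
        self.score = nn.Linear(dim, 1)                       # per-point attention logit

    def forward(self, pillars, pad_mask):
        # pillars: (P, N, in_dim) decorated points per pillar, zero-padded to N points
        # pad_mask: (P, N) True where the slot is padding
        x = self.pointnet(pillars)                           # (P, N, dim)
        logits = self.score(x).squeeze(-1).masked_fill(pad_mask, float('-inf'))
        w = torch.softmax(logits, dim=1).unsqueeze(-1)       # (P, N, 1) weights over real points
        return (w * x).sum(dim=1)                            # (P, dim) pillar features

pillar_feats = AttentionPillarEncoder()(torch.randn(600, 32, 9), torch.zeros(600, 32, dtype=torch.bool))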
11. Visualizing perceived spatial data quality of 3D objects within virtual globes (Cited by 1)
Authors: Krista Jones, Rodolphe Devillers, Yvan Bedard, Olaf Schroth. International Journal of Digital Earth (SCIE, EI), 2014, No. 10, pp. 771-788 (18 pages)
Virtual globes (VGs) allow Internet users to view geographic data of heterogeneous quality created by other users. This article presents a new approach for collecting and visualizing information about the perceived quality of 3D data in VGs, aiming to improve users' awareness of the quality of 3D objects. Instead of relying on existing metadata or on formal accuracy assessments that are often impossible in practice, we propose a crowd-sourced quality recommender system based on the five-star visualization method successful in other types of Web applications. Four alternative five-star visualizations were implemented in a Google Earth-based prototype and tested through a formal user evaluation, which helped identify the most effective method for a 3D environment. Results indicate that while most websites use a visualization approach that shows a 'number of stars', this method was the least preferred by participants. Instead, participants ranked the 'number within a star' method highest, as it reduced visual clutter in urban settings, suggesting that 3D environments such as VGs require different design approaches than 2D or non-geographic applications. Results also confirmed that expert and non-expert users of geographic data share similar preferences for the most and least preferred visualization methods.
Keywords: virtual globes; spatial data quality; uncertainty; quality recommender system; five-star; 3D objects
12. RGB Image- and Lidar-Based 3D Object Detection Under Multiple Lighting Scenarios (Cited by 1)
Authors: Wentao Chen, Wei Tian, Xiang Xie, Wilhelm Stork. Automotive Innovation (EI, CSCD), 2022, No. 3, pp. 251-259 (9 pages)
In recent years, camera- and lidar-based 3D object detection has achieved great progress. However, related research mainly focuses on normal illumination conditions; the performance of these 3D detection algorithms decreases under low-lighting scenarios such as at night. This work attempts to improve fusion strategies for 3D vehicle detection accuracy under multiple lighting conditions. First, distance and uncertainty information is incorporated to guide the "painting" of semantic information onto the point cloud during data preprocessing. Moreover, a multitask framework is designed that incorporates uncertainty learning to improve detection accuracy under low-illumination scenarios. In validation on the KITTI and Dark-KITTI benchmarks, the proposed method increases vehicle detection accuracy on the KITTI benchmark by 1.35%, and the generality of the model is validated on the proposed Dark-KITTI dataset with a gain of 0.64% for vehicle detection.
Keywords: 3D object detection; multi-sensor fusion; uncertainty estimation; semantic segmentation; PointPainting
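A sketch of distance- and uncertainty-guided point painting during preprocessing. The weighting formula, camera intrinsics, and score shapes are illustrative assumptions, not the paper's exact scheme.

import numpy as np

def paint_points(points, seg_scores, seg_uncertainty, K, max_range=70.0):
    """PointPainting-style preprocessing: project each lidar point into the image,
    look up the semantic class scores at that pixel, and append them to the point,
    down-weighted by segmentation uncertainty and by distance.
    points: (N, 3) in camera coordinates; seg_scores: (H, W, C) per-pixel class scores;
    seg_uncertainty: (H, W) values in [0, 1]; K: 3x3 camera intrinsics."""
    H, W, C = seg_scores.shape
    uvw = (K @ points.T).T
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)
    valid = (uvw[:, 2] > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    painted = np.zeros((points.shape[0], C))
    u, v = uv[valid, 0], uv[valid, 1]
    dist = np.linalg.norm(points[valid], axis=1)
    weight = (1.0 - seg_uncertainty[v, u]) * np.clip(1.0 - dist / max_range, 0.0, 1.0)
    painted[valid] = seg_scores[v, u] * weight[:, None]
    return np.hstack([points, painted])                      # (N, 3 + C) painted points

K = np.array([[700.0, 0, 640], [0, 700.0, 360], [0, 0, 1]])
pts = np.random.uniform([-20, -2, 1], [20, 2, 60], size=(2000, 3))
out = paint_points(pts, np.random.rand(720, 1280, 4), np.random.rand(720, 1280), K)
print(out.shape)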
13. PointGAT: Graph attention networks for 3D object detection
Authors: Haoran Zhou, Wei Wang, Gang Liu, Qingguo Zhou. Intelligent and Converged Networks (EI), 2022, No. 2, pp. 204-216 (13 pages)
3D object detection is a critical technology in many applications, and among the various detection methods, point-cloud-based methods have been the most popular research topic in recent years. Since Graph Neural Networks (GNNs) are considered effective at processing point clouds, in this work we combine them with the attention mechanism and propose a 3D object detection method named PointGAT. PointGAT outperforms previous approaches on the KITTI test dataset, and experiments in real campus scenarios also demonstrate the potential of our method for further applications.
Keywords: 3D object detection; point cloud; graph neural network; attention mechanism
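A generic graph-attention layer over a k-nearest-neighbour graph of points, as a minimal illustration of the GNN-plus-attention idea; the layer sizes, the value of k, and the scoring function are assumptions and do not reproduce PointGAT's architecture.

import torch
import torch.nn as nn

class PointGraphAttention(nn.Module):
    """One graph-attention layer on a k-nearest-neighbour graph of points:
    each point aggregates neighbour features with learned attention weights."""
    def __init__(self, in_dim=3, dim=64, k=16):
        super().__init__()
        self.k = k
        self.lin = nn.Linear(in_dim, dim)
        self.att = nn.Linear(2 * dim, 1)

    def forward(self, xyz):
        # xyz: (N, 3) point coordinates; the graph is built from Euclidean kNN.
        idx = torch.cdist(xyz, xyz).topk(self.k, largest=False).indices   # (N, k) neighbour ids
        h = self.lin(xyz)                                                 # (N, dim)
        neigh = h[idx]                                                    # (N, k, dim)
        centre = h.unsqueeze(1).expand_as(neigh)
        e = torch.relu(self.att(torch.cat([centre, neigh], dim=-1)))      # (N, k, 1) raw scores
        a = torch.softmax(e, dim=1)                                       # attention over neighbours
        return (a * neigh).sum(dim=1)                                     # (N, dim) updated features

feats = PointGraphAttention()(torch.randn(1024, 3))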
14. Construction of an Immersive 3D Virtual Simulation Experiment Platform (Cited by 2)
Authors: 李亚南, 李聪聪, 马丽, 任力生. 《实验室研究与探索》 (Research and Exploration in Laboratory; CAS, Peking University Core), 2024, No. 6, pp. 201-208 (8 pages)
To overcome the temporal and spatial limitations of traditional practical teaching and make it more interactive, this work takes innovative Internet of Things (IoT) applications in a virtual smart city as its research object, combines them with virtual reality technology, and integrates value creation and entrepreneurial literacy into the experiment content and teaching mode, building an immersive 3D virtual simulation experiment platform for the IoT major that incorporates an "innovative experiment path" and "multi-level comprehensive experiment projects". Bloom's taxonomy of educational objectives is used to design the teaching objectives of the 3D virtual simulation experiments, and, following the IAPVE implementation model, a teaching implementation model and an assessment and evaluation model based on the platform are constructed. Implementation results show that the platform, together with its teaching implementation and assessment models, can guide practical teaching reform and achieve a comprehensive, coordinated improvement of students' overall abilities.
Keywords: 3D virtual simulation; Bloom's taxonomy of educational objectives; teaching implementation model; assessment and evaluation model
15. LWD-3D: Lightweight Detector Based on Self-Attention for 3D Object Detection
Authors: Shuo Yang, Huimin Lu, Tohru Kamiya, Yoshihisa Nakatoh, Seiichi Serikawa. CAAI Artificial Intelligence Research, 2022, No. 2, pp. 137-143 (7 pages)
Lightweight modules play a key role in 3D object detection for autonomous driving and are necessary for the practical application of 3D object detectors. At present, research still focuses on constructing complex models and computations that improve detection precision at the expense of running speed. Building a lightweight model that learns global features from point cloud data for 3D object detection therefore remains a significant problem. In this paper, we focus on combining convolutional neural networks with self-attention-based vision transformers to realize lightweight, high-speed computing for 3D object detection. We propose lightweight detection 3D (LWD-3D), a point cloud conversion and lightweight vision transformer for autonomous driving. LWD-3D utilizes a one-shot regression framework in 2D space and generates 3D object bounding boxes from point cloud data, providing a new feature representation method based on a vision transformer for 3D detection applications. Results on the KITTI 3D dataset show that LWD-3D achieves real-time detection (less than 20 ms per image). LWD-3D obtains a mean average precision (mAP) 75% higher than that of another 3D real-time detector, with half the number of parameters. Our research extends the application of vision transformers to 3D object detection tasks.
Keywords: 3D object detection; point clouds; vision transformer; one-shot regression; real-time
16. A Monocular 3D Object Detection Method Based on the Fusion of Depth and Instance Segmentation
Authors: 孙逊, 冯睿锋, 陈彦如. 《计算机应用》 (Journal of Computer Applications; CSCD, Peking University Core), 2024, No. 7, pp. 2208-2215 (8 pages)
To address the poor performance of monocular 3D object detection under object-size changes caused by viewpoint variation and under object occlusion, a novel monocular 3D object detection method that fuses depth information and instance segmentation masks is proposed. First, a depth-mask attention fusion (DMAF) module combines depth information with instance segmentation masks to provide more accurate object boundaries. Second, dynamic convolution is introduced, and the fused features produced by the DMAF module guide the generation of the dynamic convolution kernels, so that objects of different scales can be handled. Third, a 2D-3D bounding box consistency loss is added to the loss function, adjusting the predicted 3D bounding boxes to be highly consistent with the corresponding 2D detection boxes, which improves both instance segmentation and 3D object detection. Finally, ablation studies verify the effectiveness of the method, which is evaluated on the KITTI test set. Experimental results show that, compared with a method using only the depth estimation map and instance segmentation mask, the average precision for the car category at the moderate difficulty level improves by 6.36 percentage points, and the method outperforms comparison approaches such as D4LCN (Depth-guided Dynamic-Depthwise-Dilated Local Convolutional Network) and M3D-RPN (Monocular 3D Region Proposal Network) on both 3D object detection and bird's-eye-view detection.
Keywords: monocular 3D object detection; deep learning; dynamic convolution; instance segmentation
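A sketch of dynamic convolution in which the kernels are generated from a guidance feature, standing in for the DMAF-guided kernel generation; the global-pooled guidance and depthwise 3x3 kernels are simplifying assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv(nn.Module):
    """Dynamic convolution: a small head predicts per-sample depthwise 3x3 kernels
    from a guidance feature (e.g. a fused depth/mask feature), and those kernels
    are applied to the image feature map, so the filtering adapts to each input."""
    def __init__(self, channels=64, k=3):
        super().__init__()
        self.channels, self.k = channels, k
        self.kernel_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels * k * k, 1),        # one k x k kernel per channel
        )

    def forward(self, feat, guidance):
        B, C, H, W = feat.shape
        kernels = self.kernel_head(guidance).view(B * C, 1, self.k, self.k)
        # Grouped conv: fold the batch into groups so each sample uses its own kernels.
        out = F.conv2d(feat.view(1, B * C, H, W), kernels, padding=self.k // 2, groups=B * C)
        return out.view(B, C, H, W)

out = DynamicConv()(torch.randn(2, 64, 96, 320), torch.randn(2, 64, 96, 320))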
17. Monocular 3D Object Detection for Autonomous Driving Based on Contextual Transformer
Authors: 厍向阳, 颜唯佳, 董立红. 《计算机工程与应用》 (Computer Engineering and Applications; CSCD, Peking University Core), 2024, No. 19, pp. 178-189 (12 pages)
To address missed detections and the poor detection of multi-scale objects in current monocular 3D object detection, a Contextual-Transformer-based monocular 3D object detection algorithm for autonomous driving (CM-RTM3D) is proposed. Contextual Transformer (CoT) is introduced into ResNet-50 to build a ResNet-Transformer architecture for feature extraction. A multi-scale spatial perception module (MSP) is designed: scale-space response operations mitigate the loss of shallow features, a coordinate attention mechanism (CA) is embedded along the horizontal and vertical spatial directions, and a softmax function generates soft importance weights for each scale. In the offset loss, the Huber loss replaces the L1 loss. Experimental results on the KITTI autonomous driving dataset show that, compared with the RTM3D algorithm, the proposed algorithm improves AP3D by 4.84, 3.82, and 5.36 percentage points and APBEV by 4.75, 6.26, and 3.56 percentage points at the easy, moderate, and hard difficulty levels, respectively.
Keywords: autonomous driving; monocular 3D object detection; Contextual Transformer; multi-scale perception; coordinate attention mechanism
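Two of the described ingredients, softmax importance weights over scales and the Huber loss replacing L1 for the offset term, can be sketched as follows (the shapes and the bilinear resizing are assumptions; the CoT blocks and coordinate attention are omitted):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftScaleFusion(nn.Module):
    """Fuse a feature pyramid with softmax 'soft importance weights' per scale,
    after resizing every level to the finest resolution."""
    def __init__(self, num_scales=3):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_scales))   # one learned logit per scale

    def forward(self, feats):                                 # list of (B, C, Hi, Wi), finest first
        w = torch.softmax(self.logits, dim=0)
        size = feats[0].shape[-2:]
        return sum(w[i] * F.interpolate(f, size=size, mode='bilinear', align_corners=False)
                   for i, f in enumerate(feats))

fused = SoftScaleFusion()([torch.randn(2, 64, 96, 320), torch.randn(2, 64, 48, 160), torch.randn(2, 64, 24, 80)])

# Huber loss replacing L1 for the offset regression term:
offset_loss = nn.HuberLoss(delta=1.0)(torch.randn(8, 2), torch.randn(8, 2))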
18. 3D Object Detection with Feature Distribution Alignment and Semantic Association Guided by LiDAR Point Clouds
Authors: 郑锦, 蒋博韬, 彭微, 王森. 《电子学报》 (Acta Electronica Sinica; EI, CAS, CSCD, Peking University Core), 2024, No. 5, pp. 1700-1715 (16 pages)
Existing pseudo-point-cloud-based 3D object detection algorithms are far less accurate than 3D object detection based on real LiDAR (Light Detection and Ranging) point clouds. This paper therefore studies pseudo point cloud reconstruction and proposes a 3D object detection network suited to pseudo point clouds. Considering that pseudo point clouds converted from image depth are dense and become gradually sparser as depth increases, a depth-dependent pseudo point cloud sparsification method is proposed, which reduces subsequent computation while retaining more valid pseudo points at medium and long range, thereby achieving pseudo point cloud reconstruction. A 3D object detection network with LiDAR-guided feature distribution alignment and semantic association is then proposed: during training, a LiDAR point cloud branch guides the generation of pseudo point cloud object features, so that their distribution converges toward that of LiDAR point cloud features and the performance loss caused by inconsistent data sources is reduced. To address the insufficient semantic association among pseudo points inside the 3D proposals produced by the RPN (Region Proposal Network), an attention-aware module embeds inter-point semantic associations into the pseudo point cloud feature representation through an attention mechanism, improving 3D detection accuracy. Experiments on the KITTI 3D object detection dataset show that existing 3D detection networks using the reconstructed pseudo point clouds gain 2.61% in detection accuracy, and the proposed feature distribution alignment and semantic association network raises pseudo-point-cloud-based 3D detection accuracy by a further 0.57%, also surpassing other strong 3D object detection methods.
Keywords: 3D object detection; pseudo point cloud; semantic association; distribution alignment; attention awareness
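A sketch of depth-dependent sparsification of a pseudo point cloud (the linear keep-probability ramp and its thresholds are illustrative assumptions; the paper's exact scheme may differ):

import numpy as np

def depth_aware_sparsify(pseudo_points, near=10.0, far=60.0, near_keep=0.2):
    """Depth-dependent sparsification: dense near-range pseudo points are randomly
    dropped, while the already sparse mid/far-range points are kept, so the
    reconstructed cloud stays light without losing distant objects.
    pseudo_points: (N, 3+) with depth (forward distance) in column 2."""
    depth = pseudo_points[:, 2]
    # Keep probability ramps from `near_keep` at `near` metres up to 1.0 at `far` metres.
    keep_prob = np.clip(near_keep + (1.0 - near_keep) * (depth - near) / (far - near), near_keep, 1.0)
    keep = np.random.rand(len(pseudo_points)) < keep_prob
    return pseudo_points[keep]

cloud = np.random.uniform([-30, -2, 0], [30, 2, 80], size=(200000, 3))
print(len(depth_aware_sparsify(cloud)))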
19. A 3D Object Detection Algorithm for Vehicle-Mounted LiDAR Based on Multi-Branch Feature Fusion
Authors: 金伟正, 孙原, 李方玉. 《实验技术与管理》 (Experimental Technology and Management; CAS, Peking University Core), 2024, No. 1, pp. 37-43 (7 pages)
The proposed multi-branch feature fusion 3D object detection algorithm divides the unordered point cloud into regular voxels, learns voxel features with a voxel feature encoding module and a convolutional neural network, compresses the sparse 3D data into a dense two-dimensional bird's-eye-view map, and finally performs deep fusion of the multi-scale bird's-eye-view features through the coarse and fine branches of a 2D backbone network. The method aggregates the semantic, texture, and contextual information of multi-scale features, yielding more accurate spatial position information, object classification, box regression, and orientation prediction. It achieves excellent average precision on the KITTI dataset and remains robust while maintaining a reasonable frame rate.
Keywords: LiDAR point cloud; 3D object detection; receptive field; feature fusion
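The compression of sparse voxel features into a dense bird's-eye-view pseudo-image, before the coarse and fine 2D branches, might be sketched like this (the grid size and channel count are assumed values):

import torch

def scatter_to_bev(voxel_feats, coords, grid_hw=(200, 176)):
    """Compress sparse voxel features into a dense 2D bird's-eye-view pseudo-image
    by scattering each voxel feature to its (row, col) cell in the BEV grid.
    voxel_feats: (V, C); coords: (V, 2) integer BEV indices (row, col)."""
    H, W = grid_hw
    C = voxel_feats.shape[1]
    bev = torch.zeros(C, H * W, dtype=voxel_feats.dtype)
    flat = coords[:, 0] * W + coords[:, 1]
    bev[:, flat] = voxel_feats.t()                 # if a cell holds several voxels, one of them wins
    return bev.view(C, H, W)                       # ready for the 2D backbone's coarse/fine branches

v = torch.randn(5000, 64)
c = torch.stack([torch.randint(0, 200, (5000,)), torch.randint(0, 176, (5000,))], dim=1)
bev_map = scatter_to_bev(v, c)
print(bev_map.shape)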
20. A Single-Stage 3D Object Detection Model Based on Sparse Spatial Feature Aggregation and Excitation for Point Clouds
Authors: 鲁斌, 孙洋, 杨振宇. 《计算机辅助设计与图形学学报》 (Journal of Computer-Aided Design & Computer Graphics; EI, CSCD, Peking University Core), 2024, No. 5, pp. 721-733 (13 pages)
Single-stage voxel-based 3D object detection on point clouds currently suffers from fixed receptive fields and a single feature scale, so models learn point cloud features insufficiently and detection performance reaches a bottleneck. To address this, an end-to-end trainable, voxel-based, single-stage 3D object detection model is proposed. First, a multi-scale sparse spatial feature aggregation module aggregates point cloud features at different sparse spatial scales, so the features fully retain the spatial information of the point cloud. Then, the features are excited hierarchically: multi-scale receptive fields learn the features layer by layer, strengthening their representational power and reducing the influence of noise on detection results. Finally, the features are fed to the detection head for proposal classification and regression. Comparison experiments with mainstream single-stage 3D object detection models were conducted on the public autonomous driving dataset KITTI, covering nine difficulty levels across three object classes. The proposed model clearly improves average precision on five of these levels and performs particularly well on objects with sparse point clouds. The results show that the model fully extracts the spatial information of point clouds and effectively learns multi-scale point cloud features.
Keywords: 3D object detection; LiDAR point cloud; multi-scale sparse spatial feature aggregation; hierarchical excitation