Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. U20A20197 and 62306187, and by the Foundation of the Ministry of Industry and Information Technology (TC220H05X-04).
Abstract: In recent years, semantic segmentation of 3D point cloud data has attracted much attention. Unlike 2D images, where pixels are distributed regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. It is therefore very difficult to extract long-range context and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods focus on either local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor for point cloud semantic segmentation. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependencies. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for the subsequent graph convolutional network (GCN). Second, we propose a multi-key self-attention mechanism based on the Transformer to further increase the weights of crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of point cloud semantic segmentation.
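The dense-sparse sampling idea in the abstract above can be illustrated with a small NumPy sketch. The abstract does not give the actual sampling rule, so everything here is an assumption: the function name `dense_sparse_neighbors`, the parameters `k_dense` and `k_sparse`, the choice of nearest neighbors as the dense set, and uniform random sampling of farther points as the sparse set.

```python
import numpy as np

def dense_sparse_neighbors(points, center_idx, k_dense=8, k_sparse=4, seed=0):
    """Return (dense, sparse) neighbor indices for one center point.

    dense:  the k_dense nearest points (local context).
    sparse: k_sparse points drawn at random from all farther points
            (long-range context). Both rules are illustrative assumptions,
            not the sampling rule of SGT-Net.
    """
    rng = np.random.default_rng(seed)
    dists = np.linalg.norm(points - points[center_idx], axis=1)
    order = np.argsort(dists)
    order = order[order != center_idx]      # drop the center itself
    dense = order[:k_dense]                 # closest points
    far_pool = order[k_dense:]              # everything farther away
    n_sparse = min(k_sparse, len(far_pool))
    sparse = rng.choice(far_pool, size=n_sparse, replace=False)
    return dense, sparse

# Usage on a random cloud of 100 points.
points = np.random.default_rng(42).random((100, 3))
dense, sparse = dense_sparse_neighbors(points, center_idx=0)
```

The two index sets would then serve as graph vertices around the center point, so a single graph convolution could see both nearby geometry and a few distant points.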
Abstract: This paper presents a method for hand gesture recognition based on 3D point clouds. Digital image processing technology is used in this research. From the 3D points captured by a depth camera, the system first extracts raw data of the hand. After data segmentation and preprocessing, three kinds of appearance features are extracted: the number of stretched fingers, the angles between fingers, and the area distribution of the gesture region. Based on these features, the system identifies gestures using a decision tree. Experimental results demonstrate that the proposed method recognizes common gestures efficiently and with high accuracy.
Funding: Funded in part by the Key Project of Nature Science Research for Universities of Anhui Province of China (No. 2022AH051720), in part by the Science and Technology Development Fund, Macao SAR (Grant Nos. 0093/2022/A2, 0076/2022/A2, and 0008/2022/AGJ), and in part by the China University Industry-University-Research Collaborative Innovation Fund (No. 2021FNA04017).
Abstract: This paper focuses on the effective use of data augmentation techniques for 3D lidar point clouds to enhance the performance of neural network models. These point clouds, which represent spatial information through a collection of 3D coordinates, have found wide-ranging applications. Data augmentation has emerged as a potent solution to the challenges posed by limited labeled data and the need to improve model generalization. Much of the existing research is devoted to crafting novel data augmentation methods specifically for 3D lidar point clouds. However, little attention has been paid to making the most of the numerous existing augmentation techniques. Addressing this gap, this research investigates the possibility of combining two fundamental data augmentation strategies. The paper introduces PolarMix and Mix3D, two commonly employed augmentation techniques, and presents a new approach named RandomFusion. Instead of using a fixed or predetermined combination of augmentation methods, RandomFusion randomly chooses one method from a pool of options for each instance or sample. Concretely, each point in the point cloud is augmented with either PolarMix or Mix3D, chosen at random. Experimental results validate the efficacy of the RandomFusion strategy in enhancing the performance of neural network models for 3D lidar point cloud semantic segmentation, without compromising computational efficiency. By examining the potential of merging different augmentation techniques, the research contributes to a more comprehensive understanding of how to use existing augmentation methods for 3D lidar point clouds. The RandomFusion data augmentation technique offers a simple yet effective way to leverage the diversity of augmentation techniques and boost the robustness of models. The insights gained from this research can pave the way for future work aimed at developing more advanced and efficient data augmentation strategies for 3D lidar point cloud analysis.
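The random selection at the heart of RandomFusion might be sketched as follows. This is only an illustration under stated assumptions: `mix3d_like` and `polarmix_like` are heavily simplified stand-ins written for this example (they are not the published PolarMix or Mix3D implementations), labels are omitted, and the choice is made once per pair of scans.

```python
import numpy as np

def mix3d_like(a, b, rng):
    """Simplified stand-in: merge two scans into one scene."""
    return np.concatenate([a, b], axis=0)

def polarmix_like(a, b, rng):
    """Simplified stand-in: replace one azimuth sector of scan a with scan b's."""
    start = rng.uniform(-np.pi, np.pi / 2)   # sector covers a quarter turn
    end = start + np.pi / 2

    def in_sector(p):
        azimuth = np.arctan2(p[:, 1], p[:, 0])
        return (azimuth >= start) & (azimuth < end)

    return np.concatenate([a[~in_sector(a)], b[in_sector(b)]], axis=0)

# Pool of candidate augmentations (hypothetical names).
AUGMENTATIONS = (mix3d_like, polarmix_like)

def random_fusion(a, b, rng):
    """Pick one augmentation uniformly at random and apply it.

    Returns the augmented points and the name of the chosen method.
    """
    augment = AUGMENTATIONS[rng.integers(len(AUGMENTATIONS))]
    return augment(a, b, rng), augment.__name__

# Usage: two random "scans" of different sizes.
rng = np.random.default_rng(0)
scan_a = rng.normal(size=(50, 3))
scan_b = rng.normal(size=(60, 3))
fused, chosen = random_fusion(scan_a, scan_b, rng)
```

Keeping the pool as a plain tuple of callables means further augmentations could be added without touching `random_fusion`.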
Funding: Supported by the National Natural Science Foundation of China (62173103) and the Fundamental Research Funds for the Central Universities of China (3072022JC0402, 3072022JC0403).
Abstract: For the first time, this article introduces a LiDAR Point Clouds Dataset of Ships composed of both collected and simulated data to address the scarcity of LiDAR data in maritime applications. The collected data are acquired using specialized maritime LiDAR sensors in both inland waterways and wide-open ocean environments. The simulated data are generated by placing a ship in the LiDAR coordinate system and scanning it with a redeveloped Blensor that emulates the operation of a LiDAR sensor equipped with various laser beams. Furthermore, we render point clouds for foggy and rainy weather conditions. To describe a realistic shipping environment, a dynamic tail wave is modeled by iterating the wave elevation of each point in a time series. Finally, networks serving small objects are migrated to ship applications by feeding them our dataset. The positive effect of simulated data is described in object detection experiments, and the negative impact of tail waves as noise is verified in single-object tracking experiments. The dataset is available at https://github.com/zqy411470859/ship_dataset.
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant No. U20A20200), the Major Research (Grant No. 92148204), the Guangdong Basic and Applied Basic Research Foundation (Grant Nos. 2019B1515120076 and 2020B1515120054), and the Industrial Key Technologies R&D Program of Foshan (Grant Nos. 2020001006308 and 2020001006496).
Abstract: Owing to the constraints of unstructured environments, it is difficult to ensure safe, accurate, and smooth completion of tasks using autonomous robots. Moreover, for small-batch and customized tasks, autonomous operation requires path planning for each task, which reduces efficiency. We propose a human-robot shared control system based on a 3D point cloud and teleoperation, in which a robot assists human operators in performing dangerous and cumbersome tasks. The system leverages the operator's skills and experience to deal with emergencies and perform online error correction. In this framework, a depth camera acquires the 3D point cloud of the target object to automatically adjust the end-effector orientation. The operator controls the manipulator trajectory through a teleoperation device. The force exerted by the manipulator on the object is automatically adjusted by the robot, which reduces the operator's workload and improves the efficiency of task execution. In addition, hybrid force/motion control is used to decouple teleoperation from force control so that force and position regulation do not interfere with each other. The proposed framework was validated using the ELITE robot to perform a force control scanning task.
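One common way to keep force and motion regulation from interfering is a selection vector that assigns each Cartesian axis to exactly one controller. The sketch below assumes that scheme; the abstract does not confirm that the authors use it. The function `hybrid_command`, the gain `k_f`, and the proportional force law are all hypothetical.

```python
import numpy as np

def hybrid_command(v_teleop, f_desired, f_measured, force_axes, k_f=0.01):
    """Combine operator velocity and a force correction on disjoint axes.

    v_teleop:   velocity commanded by the operator, shape (3,)
    f_desired:  desired contact force, shape (3,)
    f_measured: measured contact force, shape (3,)
    force_axes: axes handed to the force controller, e.g. (2,) for z
    k_f:        proportional gain mapping force error to velocity (assumed)
    """
    select = np.ones(3)
    select[list(force_axes)] = 0.0             # 1 = operator axis, 0 = force axis
    v_force = k_f * (f_desired - f_measured)   # simple proportional force law
    return select * v_teleop + (1.0 - select) * v_force

# Usage: the operator steers x and y while the robot regulates force along z.
command = hybrid_command(
    v_teleop=np.array([0.1, 0.2, 0.3]),
    f_desired=np.array([0.0, 0.0, 5.0]),
    f_measured=np.array([0.0, 0.0, 3.0]),
    force_axes=(2,),
)
```

Because each axis receives input from only one source, the operator's z command is ignored and the force error never leaks into x or y.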
Abstract: To address the current issues of inaccurate segmentation and the limited applicability of segmentation methods for building facades in point clouds, we propose a facade segmentation algorithm based on optimal dual-scale feature descriptors. First, we select the optimal dual-scale descriptors from a range of feature descriptors. Next, we segment the facade according to the threshold value of the chosen optimal dual-scale descriptors. Finally, we use RANSAC (Random Sample Consensus) to fit the segmented surface and optimize the fitting result. Experimental results show that, compared to commonly used facade segmentation algorithms, the proposed method yields more accurate segmentation results, providing a robust data foundation for subsequent 3D model reconstruction of buildings.
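The RANSAC fitting step mentioned above can be sketched for the planar case. This is a generic textbook version, not the authors' implementation; the function name `ransac_plane`, the iteration count, and the inlier threshold are assumptions, and the later optimization of the fit is not covered.

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.02, seed=0):
    """Fit a plane n.x + d = 0 to points with basic RANSAC.

    Returns ((normal, d), inlier_mask); the plane is None if every
    sampled triple was degenerate.
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_plane = None
    for _ in range(n_iters):
        # Three random points define a candidate plane.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        length = np.linalg.norm(normal)
        if length < 1e-12:          # collinear sample, no unique plane
            continue
        normal = normal / length
        d = -normal @ sample[0]
        # Points closer to the plane than the threshold count as inliers.
        inliers = np.abs(points @ normal + d) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane, best_inliers

# Usage: a flat wall at z = 0 plus scattered outliers above it.
rng = np.random.default_rng(3)
wall = np.column_stack([rng.random(200), rng.random(200), np.zeros(200)])
clutter = np.column_stack([rng.random(50), rng.random(50), rng.uniform(0.5, 1.5, 50)])
plane, mask = ransac_plane(np.vstack([wall, clutter]))
```

A least-squares refit on the returned inliers is a typical follow-up when a more precise plane is needed.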
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61811530281 and 61861136009), the Guangdong Regional Joint Foundation (No. 2019B1515120076), and the Fundamental Research for the Central Universities.
Abstract: In this paper, a novel compression framework based on 3D point cloud data is proposed for telepresence. It consists of two parts. The first removes spatial redundancy: a robust Bayesian framework is designed to track human motion, and the 3D point cloud data of the human body are acquired using the tracked 2D box. The second removes the temporal redundancy of the 3D point cloud data using motion vectors: for each cluster in the current frame, the most similar cluster in the previous frame is found by comparing cluster features, and the cluster in the current frame is replaced by the motion vector to compress the current frame. First, the B-SHOT (binary signatures of histograms of orientations) descriptor is applied to represent point features for matching corresponding points between two frames. Second, the K-means algorithm is used to generate clusters, because there are many unsuccessfully matched points in the current frame. The matching operation is then used to find corresponding clusters between the point cloud data of the two frames. Finally, the cluster information in the current frame is replaced by the motion vector to compress the current frame, and the unsuccessfully matched clusters in the current frame and the motion vectors are transmitted to the remote end. To reduce the calculation time of the B-SHOT descriptor, we introduce an octree structure into it. In particular, to improve the robustness of the matching operation, we design a cluster feature to estimate the similarity between two clusters. Experimental results show that the proposed method performs better, with lower calculation time and a higher compression ratio. Under similar distortion rates, the proposed method achieves a compression ratio of 8.42 and a delay of 1228 ms, compared with a compression ratio of 5.99 and a delay of 2163 ms for the octree-based compression method.
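The temporal step, replacing a matched cluster with a motion vector, might look like the sketch below. The abstract compares a purpose-built cluster feature that it does not specify, so centroid distance is used here as a stand-in; `cluster_motion_vectors` and `max_dist` are hypothetical names introduced for the example.

```python
import numpy as np

def cluster_motion_vectors(prev_centroids, curr_centroids, max_dist=0.5):
    """Match each current cluster to its nearest previous cluster.

    Returns:
      matched:   list of (curr_idx, prev_idx, motion_vector) for clusters
                 that can be sent as a motion vector only
      unmatched: current cluster indices that must be sent in full
    Centroid distance stands in for the paper's cluster feature.
    """
    matched, unmatched = [], []
    for i, centroid in enumerate(curr_centroids):
        dists = np.linalg.norm(prev_centroids - centroid, axis=1)
        j = int(np.argmin(dists))
        if dists[j] <= max_dist:
            matched.append((i, j, centroid - prev_centroids[j]))
        else:
            unmatched.append(i)
    return matched, unmatched

# Usage: one cluster moved slightly, one appeared far from anything known.
prev = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
curr = np.array([[0.1, 0.0, 0.0], [5.0, 5.0, 5.0]])
matched, unmatched = cluster_motion_vectors(prev, curr)
```

Only three numbers per matched cluster need to be transmitted instead of all its points, which is where the compression gain would come from.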
Funding: Supported by the National Natural Science Foundation of China (Nos. 41171355 and 41002120).
Abstract: A new object-oriented method has been developed for the extraction of Mars rocks from Mars rover data. It is based on a combination of Mars rover imagery and 3D point cloud data. First, Navcam or Pancam images taken by the Mars rovers are segmented into homogeneous objects with a mean-shift algorithm. Then, the objects in the segmented images are classified into small rock candidates, rock shadows, and large objects. Rock shadows and large objects are considered as the regions within which large rocks may exist. In these regions, large rock candidates are extracted through ground-plane fitting with the 3D point cloud data. Small and large rock candidates are combined and postprocessed to obtain the final rock extraction results. The shape properties of the rocks (angularity, circularity, width, height, and width-height ratio) have been calculated for subsequent geological studies.
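Abstract: In autonomous driving perception systems, vision sensors and LiDAR are key sources of information. In current 3D object detection tasks, however, most point-cloud-only networks outperform networks that fuse images with LiDAR point clouds. Existing studies attribute this to the viewpoint misalignment between image and LiDAR information and to the difficulty of matching heterogeneous features, so single-stage fusion algorithms struggle to fully fuse the features of the two modalities. To address this, this paper proposes a new multi-layer, multi-modal fusion method for 3D object detection. First, in the early-fusion stage, points inside the viewing frustum formed by each 2D detection box are painted with locally ordered color information (Red Green Blue, RGB). The encoded point cloud is then fed into a channel-expanded PointPillars detection network that incorporates self-attention-based context awareness. In the late-fusion stage, the 2D and 3D candidate boxes are encoded into two sets of sparse tensors before non-maximum suppression, and a camera-LiDAR object candidate fusion network produces the final 3D detection results. Experiments on the KITTI dataset show that the proposed fusion method significantly outperforms the point-cloud-only baseline, with the average mAP improved by 6.24%.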
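The early-fusion step, painting points that fall inside the frustum of a 2D detection box with RGB values, can be sketched as follows. The sketch assumes the points are already expressed in the camera frame and that `K` is a standard 3x3 pinhole intrinsic matrix. The function `paint_points` is hypothetical, and the "locally ordered" aspect of the encoding is not modeled because the abstract does not define it.

```python
import numpy as np

def paint_points(points_cam, image, K, boxes):
    """Append RGB channels to points that project into any 2D box.

    points_cam: (N, 3) points in the camera frame (z forward)
    image:      (H, W, 3) uint8 image
    K:          (3, 3) pinhole intrinsics
    boxes:      iterable of (x1, y1, x2, y2) pixel boxes
    Returns an (N, 6) array: xyz followed by RGB in [0, 1], zeros if unpainted.
    """
    n = len(points_cam)
    painted = np.zeros((n, 6), dtype=np.float32)
    painted[:, :3] = points_cam
    projected = points_cam @ K.T
    depth = projected[:, 2]
    in_front = depth > 0                     # ignore points behind the camera
    u = np.zeros(n)
    v = np.zeros(n)
    u[in_front] = projected[in_front, 0] / depth[in_front]
    v[in_front] = projected[in_front, 1] / depth[in_front]
    height, width = image.shape[:2]
    in_image = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    for x1, y1, x2, y2 in boxes:
        mask = in_image & (u >= x1) & (u < x2) & (v >= y1) & (v < y2)
        rows = v[mask].astype(int)
        cols = u[mask].astype(int)
        painted[mask, 3:] = image[rows, cols] / 255.0
    return painted

# Usage: a red image, one box around the image center, three test points.
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
image = np.zeros((100, 100, 3), dtype=np.uint8)
image[..., 0] = 255
pts = np.array([[0.0, 0.0, 1.0], [0.4, 0.0, 1.0], [0.0, 0.0, -1.0]])
painted = paint_points(pts, image, K, boxes=[(40, 40, 60, 60)])
```

Points outside every box keep zero color channels, so a downstream detector receives a fixed six-channel input regardless of how many boxes were detected.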
Funding: This work is supported through grants from the National Natural Science Foundation of China (No. 61762013), the Basic Ability Improvement Project for Young and Middle-aged Teachers in Universities of Guangxi Province (No. 2018KY0078), the Science and Technology Program of Guangxi (No. 2018AD19339), and the Research Fund of Guangxi Key Lab of Multi-Source Information Mining and Security (No. 20-A-02-02).
Abstract: A tree skeleton is useful to agronomy researchers because it describes the shape and topological structure of a tree. Mutual occlusion of organs in a fruit tree canopy is usually severe, which results in a large amount of missing data in directed laser scanning 3D point clouds of a fruit tree. Traditional approaches can be ineffective and problematic in extracting the tree skeleton correctly when the tree point clouds contain occlusions and missing points. To overcome this limitation, we present a method for accurately and quickly extracting the skeleton of a fruit tree from 3D point clouds measured by a laser scanner. The start point and endpoint of a branch are selected from the point cloud through manual user interaction, and a backward search then finds a path through the 3D point cloud with a radius parameter as a constraint. Experimental results on several kinds of fruit trees demonstrate that our method can extract the skeleton of a leafy fruit tree with high accuracy.
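A radius-constrained path between a user-chosen start point and endpoint can be illustrated with a shortest-path search in which two points are connected only if they lie within the given radius. This is a sketch under assumptions: the abstract does not describe the search in detail, so Dijkstra's algorithm with backtracking from the endpoint stands in for the "backward search", and `radius_path` is a hypothetical name.

```python
import heapq
import numpy as np

def radius_path(points, start, end, radius):
    """Shortest path from start to end using hops no longer than radius.

    Returns the list of point indices along the path, or [] if the
    endpoint cannot be reached under the radius constraint.
    """
    n = len(points)
    dist = np.full(n, np.inf)
    prev = np.full(n, -1)
    dist[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue                          # stale queue entry
        if i == end:
            break
        gaps = np.linalg.norm(points - points[i], axis=1)
        for j in np.flatnonzero((gaps <= radius) & (gaps > 0)):
            candidate = d + gaps[j]
            if candidate < dist[j]:
                dist[j] = candidate
                prev[j] = i
                heapq.heappush(heap, (float(candidate), int(j)))
    if not np.isfinite(dist[end]):
        return []
    # Walk backward from the endpoint to the start along stored predecessors.
    path = [end]
    while path[-1] != start:
        path.append(int(prev[path[-1]]))
    return path[::-1]

# Usage: four points along a branch plus one isolated point.
branch = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0], [3.0, 0, 0], [10.0, 10, 10]])
path = radius_path(branch, start=0, end=3, radius=1.5)
```

The radius acts as a gap tolerance: a larger value would let the path bridge regions with missing points, at the risk of jumping between neighboring branches.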
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 61673033).
Abstract: In this paper, we propose a novel and effective approach, GridNet, to hierarchically learn deep representations of 3D point clouds. It combines regular holistic description and fast data processing in a single framework, which allows it to abstract powerful features progressively and efficiently. Moreover, to capture internal geometric attributes more accurately, anchors are inferred within local neighborhoods, in contrast to the fixed or sampled anchors used in existing methods, and the learned features are thus more representative of and discriminative for the local point distribution. GridNet delivers very competitive results compared with state-of-the-art methods in both object classification and segmentation tasks.