Abstract
Accurate and efficient point cloud classification plays a key role in tasks such as scene understanding and digital twin city construction. Point cloud classification methods that rely on a single visual data structure, such as points or voxels, tend to lose critical geometric features, whereas methods that fuse multiple data structures learn multilevel, multiscale features from the different representations but struggle to balance the differences between them, which reduces classification accuracy. To address this, a point-voxel consistency constraint network (PVCC-Net) is proposed for accurately segmenting objects of different sizes in urban scenes. PVCC-Net adopts a dual-branch U-Net structure in which the voxel and point branches extract coarse-grained and fine-grained features, respectively, and a point-voxel consistency constraint module aligns the coarse- and fine-grained features to reduce the distribution gap between features of different granularities. The network then adaptively fuses the aggregated coarse- and fine-grained features through a point-voxel self-attention mechanism, enhancing the global feature representation of the point cloud. PVCC-Net is evaluated on three urban scene point cloud datasets: Toronto3D, Semantic3D, and SensatUrban. It achieves overall accuracies (OA) of 97.97%, 93.80%, and 93.00% and mean intersection over union (mIoU) scores of 82.92%, 75.70%, and 55.40%, respectively. Comparative experiments show that, relative to the baseline methods, the proposed method effectively improves the classification of point clouds in complex urban scenes and yields better classification results.
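The dual-branch design requires converting between the point and voxel representations: point features are scattered into a coarse voxel grid for the voxel branch and later gathered back to the points for fusion. The snippet below is a minimal PyTorch sketch of this voxelization/devoxelization step; the voxel size, feature dimensions, and feature-averaging scheme are illustrative assumptions, not the exact implementation described in the paper.

```python
import torch

def voxelize(points: torch.Tensor, feats: torch.Tensor, voxel_size: float):
    """Average point features into a sparse voxel grid.

    points: (N, 3) xyz coordinates; feats: (N, C) point features.
    Returns per-voxel features (M, C) and the point-to-voxel index (N,).
    """
    # Integer voxel coordinates of every point.
    coords = torch.floor(points / voxel_size).long()              # (N, 3)
    # Unique occupied voxels and, for each point, the voxel it falls into.
    uniq, point2voxel = torch.unique(coords, dim=0, return_inverse=True)
    M, C = uniq.shape[0], feats.shape[1]
    # Sum the features of all points in each voxel, then divide by the point count.
    voxel_feats = torch.zeros(M, C, device=feats.device).index_add_(0, point2voxel, feats)
    counts = torch.zeros(M, device=feats.device).index_add_(
        0, point2voxel, torch.ones(points.shape[0], device=feats.device))
    voxel_feats = voxel_feats / counts.clamp(min=1).unsqueeze(1)
    return voxel_feats, point2voxel

def devoxelize(voxel_feats: torch.Tensor, point2voxel: torch.Tensor):
    """Gather each point's voxel feature back to the point level (nearest-voxel lookup)."""
    return voxel_feats[point2voxel]                                # (N, C)

# Toy usage: 1000 random points with 32-dimensional features, 0.5 m voxels.
pts, f = torch.rand(1000, 3) * 10, torch.rand(1000, 32)
vf, idx = voxelize(pts, f, voxel_size=0.5)
per_point_coarse = devoxelize(vf, idx)   # (1000, 32) coarse features aligned to points
```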
Objective  Accurate and efficient point cloud classification plays a vital role in tasks such as scene understanding and digital twin city construction. Traditional classification methods manually extract features and construct discriminative models to classify point clouds. However, with the increasing density of point cloud acquisition and the growth in data volume, it is difficult for traditional methods to achieve accurate and efficient point cloud classification. Recently developed deep learning-based point cloud processing methods have advanced point cloud classification. Among them, methods using a single visual data structure, such as points or voxels, are prone to losing critical geometric features, whereas methods fusing multiple data structures can learn multilevel and multiscale features from the different data. However, it is difficult to balance the differences between the various data, which reduces the accuracy of point cloud classification. In addition, LiDAR point clouds acquired from complex urban scenes contain large amounts of noise and outliers that are difficult to process. These challenges remain open problems in current point cloud classification research.

Methods  To address these problems, a point-voxel consistency constraint network (PVCC-Net) is proposed to accurately segment objects of different sizes in urban scene point clouds. The overall structure of PVCC-Net is a dual-branch U-Net encoding-decoding architecture. First, the point and voxel branches extract features from different receptive fields. The point branch extracts point-level geometric semantic features through a local feature aggregation (LFA) module, which helps reduce the effects of feature redundancy and noise. The voxel branch progressively expands the receptive field by using a convolutional network to extract voxel features at different levels. The voxel format is regular and ordered in memory, which maintains the continuity of spatial information and compensates for the shortcomings of point clouds. The fine-grained point branch and coarse-grained voxel branch cover spatial scopes of different resolutions, and combining this multilevel contextual information enhances the feature extraction capability. The point-voxel consistency constraint (PV-CC) module adequately integrates fine-grained and coarse-grained features and enhances the adaptive ability between point clouds and voxels by constraining the distances between feature branches of different granularities in the same layer of the network, which enables the model to produce more stable prediction results. Subsequently, the point-voxel self-attention (PV-SA) mechanism fuses point and voxel features while enhancing the expression of the global features. Finally, the performance of the network is further improved via weighted cross-entropy and Lovasz loss functions, resulting in accurate and efficient point cloud classification in urban scenes.

Results and Discussion  The proposed PVCC-Net is trained and evaluated on three urban scene datasets, namely, Toronto3D, Semantic3D, and SensatUrban, achieving overall accuracies (OA) of 97.97%, 93.80%, and 93.00% and mean intersection over union (mIoU) scores of 82.92%, 75.70%, and 55.40%, respectively. All experimental results outperform the baseline network (Table 2, Fig. 6, and Fig. 9).
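The two ideas carried by the PV-CC and PV-SA modules, constraining the distance between same-layer point and voxel features and then fusing them with self-attention, can be sketched roughly as follows. This is a minimal, generic PyTorch illustration; the module name, projection dimensions, and the choice of a mean-squared feature distance and a standard multi-head attention layer are assumptions for the sketch, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointVoxelFusion(nn.Module):
    """Sketch: align same-layer point/voxel features, then fuse them with self-attention."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.proj_point = nn.Linear(dim, dim)   # project point (fine-grained) features
        self.proj_voxel = nn.Linear(dim, dim)   # project voxel (coarse-grained) features
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, point_feats, voxel_feats_at_points):
        # Both inputs: (B, N, dim); voxel features already gathered back to the points.
        p = self.proj_point(point_feats)
        v = self.proj_voxel(voxel_feats_at_points)

        # Consistency constraint: penalize the distance between the two granularities
        # so their distributions stay aligned (here a simple mean-squared distance).
        consistency_loss = F.mse_loss(p, v)

        # Self-attention over the concatenated point/voxel tokens, then fold the
        # result back into one fused feature per point.
        tokens = torch.cat([p, v], dim=1)                 # (B, 2N, dim)
        fused, _ = self.attn(tokens, tokens, tokens)      # (B, 2N, dim)
        n = p.shape[1]
        fused = fused[:, :n] + fused[:, n:]               # (B, N, dim)
        return fused, consistency_loss

# Toy usage: batch of 2 clouds, 1024 points, 64-dim features from each branch.
module = PointVoxelFusion(dim=64)
fused, l_cc = module(torch.rand(2, 1024, 64), torch.rand(2, 1024, 64))
```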
In addition, PVCC-Net achieves competitive results compared with other state-of-the-art methods, which demonstrates its strong generalizability (Tables 3 and 4). Notably, PVCC-Net not only maintains the integrity of the internal structure of each category but also produces clear and accurate segmentation boundaries between categories (Figs. 4, 7, and 10). The comparative experiments and ablation studies demonstrate that features of different granularities have different semantic representation capabilities. Combining fine-grained point features with coarse-grained voxel features significantly improves the accuracy of point cloud classification, and the consistency constraint reduces the differences between features of different granularities by minimizing the feature distance, thereby improving the stability and robustness of the model (Table 5). However, the complexity analysis indicates a higher number of parameters and FLOPs in PVCC-Net, mainly because the convolution and deconvolution operations in the voxel branch incur considerable computational costs; nevertheless, the latency is close to that of the point-based and point-voxel fusion methods (Table 6).

Conclusions  In this study, PVCC-Net is used for LiDAR point cloud classification in urban scenes. The network first aligns the distributions of fine-grained point features and coarse-grained voxel features through a point-voxel consistency constraint module, then uses a point-voxel self-attention mechanism to capture long-distance contextual information and enhance the global feature representation, and finally alleviates the class imbalance of urban-scene point clouds via square-root-weighted cross-entropy and Lovasz loss functions to achieve accurate point cloud classification. On the Toronto3D, Semantic3D, and SensatUrban datasets, PVCC-Net improves the mIoU by 3.44, 0.90, and 2.30 percentage points, respectively, compared with RandLA-Net. In addition, the classification performance of PVCC-Net is comparable to that of other advanced methods. The results of the comparative experiments and ablation studies show that deeply fusing fine-grained point features with coarse-grained voxel features enhances the capability of the model to extract complex features in urban scenes, and further constraining the point and voxel features maintains the consistency of the feature distributions and improves the stability of the model's prediction results. However, PVCC-Net has a higher number of parameters and a higher computational cost. Therefore, in future research, we will explore the synergistic and complementary effects of points and voxels in lightweight scene point cloud classification.
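The class-imbalance handling mentioned above, square-root-weighted cross-entropy combined with a Lovasz term, might look roughly like the following. This is a hedged sketch: the inverse-square-root weighting formula, the illustrative class counts, and the reference to an external `lovasz_softmax` implementation are assumptions, not the exact formulation used by PVCC-Net.

```python
import torch
import torch.nn as nn

def sqrt_class_weights(class_counts: torch.Tensor) -> torch.Tensor:
    """Down-weight frequent classes: weight_c proportional to 1 / sqrt(frequency_c)."""
    freq = class_counts.float() / class_counts.sum()
    weights = 1.0 / torch.sqrt(freq + 1e-8)
    return weights / weights.mean()          # normalize so the average weight is 1

# Example: per-class point counts from a training split (illustrative numbers only).
counts = torch.tensor([5_000_000, 800_000, 120_000, 30_000, 5_000])
criterion = nn.CrossEntropyLoss(weight=sqrt_class_weights(counts))

logits = torch.randn(4096, 5)                # (num_points, num_classes)
labels = torch.randint(0, 5, (4096,))
loss = criterion(logits, labels)
# total_loss = loss + lovasz_softmax(logits.softmax(-1), labels)  # if a Lovasz term is added
```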
Authors
李虎辰
管海燕
雷相达
秦楠楠
倪欢
Li Huchen; Guan Haiyan; Lei Xiangda; Qin Nannan; Ni Huan (School of Remote Sensing & Geomatics Engineering, Nanjing University of Information Science & Technology, Nanjing 210044, Jiangsu, China)
Source
《中国激光》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2024, No. 13, pp. 243-256 (14 pages)
Chinese Journal of Lasers
Funding
Open Fund of the Key Laboratory of Land Satellite Remote Sensing Application, Ministry of Natural Resources (KLSMNR-G202305)
National Natural Science Foundation of China (41971414, 42371447)
Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_1212).
Keywords
remote sensing
point cloud classification
voxel
consistency constraint
self-attention mechanism
urban scene