Classification network for 3D point cloud based on spatial structure convolution and attention mechanism


Abstract

Objective 3D point cloud classification is a key task with broad applications in computer vision, robotics, and autonomous driving. When existing 3D point cloud classification networks use edge convolution for local feature extraction, they typically suffer from small differences among input features and from insufficient extraction and fusion of spatial structure information. To address these problems, a point cloud classification network combining spatial structure convolution and an attention mechanism is designed. Method First, a spatial structure convolution is proposed. Building on edge convolution, it introduces relative position information between neighboring points to reduce the similarity of input features, and then encodes features separately from the structure and location perspectives to capture more diverse local geometric structures. Second, a global feature encoding module is designed to distill global feature information from coordinate information. An attention mechanism is also integrated into the network to associate the local and global feature representations, which preserves global feature information and adjusts the global features adaptively. Finally, local geometric structure information and global location information are fused to obtain more representative and more discriminative feature representations. Result Experiments on the public ModelNet40 dataset evaluate the proposed network model. The overall accuracy and mean accuracy of point cloud classification reach 93.0% and 89.7%, respectively, with good classification performance and prediction efficiency. The experimental results show that spatial structure convolution increases the diversity of input features, and that encoding location and structure separately improves the expressive power of local features. In addition, the proposed attention weighting scheme associates local and global features while preserving the global features. Conclusion The proposed network has a strong capacity for fine-grained feature extraction and achieves good classification performance.

Objective 3D point cloud classification is a crucial task with diverse applications in computer vision, robotics, and autonomous driving. The advancement of computing device performance in recent years has enabled researchers to apply deep learning methods to the field of 3D point cloud recognition. Deep learning-based methods currently in use for 3D point cloud classification typically divide the feature information captured by a network into two distinct parts: global and local features. Global features refer to the overall shape and structure of the point cloud, while local features capture more detailed information about individual points. By leveraging global and local features, these methods can achieve high accuracy in point cloud classification tasks. Edge convolution (EdgeConv) is currently the most widely used method for local feature extraction in 3D point cloud classification. This method incorporates relative position vectors into feature encoding to capture the characteristics of local structures effectively. However, when local structures in 3D point clouds are similar, the use of relative positions in feature encoding may result in similar features, leading to poor classification results. Furthermore, encoding only local features may be insufficient for achieving optimal classification results, because considering the correlation between local and global features is also crucial. Current methods frequently employ attention mechanisms to learn attention scores from global features and weight local features accordingly, effectively establishing the correlation between local and global features. However, these methods may not fully consider the importance of global feature information and may therefore produce suboptimal classification results.

Method To address the aforementioned challenges, this study proposes a novel 3D point cloud classification network that leverages spatial structure convolution (SSConv) and attention mechanisms. The proposed network architecture consists of two parts: a local feature encoding (LFE) module and a global feature encoding (GFE) module. The former uses SSConv to encode local features from location and structure, while the latter learns a global feature representation from raw coordinate data. Furthermore, to enable effective correlation and complementarity between feature information, we introduce an attention mechanism that facilitates adaptive adjustment of global features through weighted operations. The LFE module is composed of two operations: graph construction and feature extraction. For graph construction, the LFE module uses the K-nearest neighbor (KNN) algorithm to identify adjacent points and construct a graph structure. For feature extraction, it uses SSConv, which involves a multilayer perceptron. Compared with EdgeConv, SSConv introduces a relative position vector between adjacent points. This operation effectively increases the correlation distance between raw input data, enriches local region structure information, and enhances the spatial expression ability of the extracted high-level semantic information. To capture more effective local structure features, feature extraction is encoded separately on the basis of structure and location. In particular, the location encoding branch encodes the coordinate information separately to obtain richer location feature information for describing the spatial location of each point. Meanwhile, the structure encoding branch encodes the relative location vector separately to learn the structure information in the local region for describing the overall geometric structure of the local neighborhood. The GFE module maps raw coordinate data to high-dimensional features, which are used as the global feature representation of the point cloud. In addition, the module includes an attention mechanism to enhance the correlation between local and global features. In particular, an attention weighting method is used to guide the learning of global feature information by using local feature information. This operation enables correlation and fusion between local and global feature representations while preserving raw feature information.

Result To evaluate the performance of the proposed network model, experimental validation was conducted on the publicly available ModelNet40 dataset, which consists of 9843 training models and 2468 testing models in 40 classes. Classification performance was evaluated using overall accuracy (OA) and mean accuracy (mAcc). The proposed model was compared with four pointwise methods, two convolution-based methods, two graph convolution-based methods, and four attention mechanism-based methods. The experimental results demonstrate that the proposed network exhibits good performance in the point cloud classification task and is capable of effectively representing local and global features. The proposed method achieves an OA of 93.0%, outperforming dynamic graph convolutional neural network (DGCNN) by 0.1%, PointWeb by 0.7%, and PointCNN by 0.8%. In addition, the mAcc of the proposed method reaches 89.7%. Furthermore, an experiment was designed to validate the efficacy of SSConv. Replacing SSConv with EdgeConv in the network architecture reduces OA by 0.5% on the ModelNet40 dataset, demonstrating that SSConv is better suited for local representation than EdgeConv. Meanwhile, an experiment was designed to verify the diversity of the input features of SSConv. The correlation of features was evaluated using Euclidean, cosine, and correlation distances. The results indicate that SSConv enhances diversity among input features more effectively than EdgeConv. Furthermore, the visualization results of the intermediate layer features in the model demonstrate that SSConv can learn more distinctive features.

Conclusion The proposed network model achieves better classification results, with an OA of 93.0% and an mAcc of 89.7%, surpassing those of existing methods. The proposed spatial structure convolution effectively enhances the variability of input features, allowing the model to learn more diverse local feature representations of objects. The proposed global feature encoding method based on the attention mechanism effectively adjusts features and fully extracts the relationship between local and global feature information while preserving global features. To summarize, the proposed network model exhibits good capability for fine-grained feature extraction and achieves good classification performance.
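The graph construction step and the inputs to local feature extraction described in the Method part can be illustrated with a minimal NumPy sketch. The KNN search and the EdgeConv-style input (the center point concatenated with the offset to each neighbor) follow the standard formulation. The function `neighbor_pair_offsets` is a hypothetical reading of "a relative position vector between adjacent points": the abstract does not give the exact SSConv formula, so the offsets between consecutive neighbors used here are an assumption for illustration only, and all function names are invented for this sketch.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point, self excluded. Shape (N, k)."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)       # pairwise squared distances, (N, N)
    np.fill_diagonal(dist2, np.inf)          # a point is not its own neighbor
    return np.argsort(dist2, axis=1)[:, :k]  # nearest first

def edge_features(points, idx):
    """EdgeConv-style input [x_i, x_j - x_i] per neighbor j. Shape (N, k, 6)."""
    neighbors = points[idx]                                         # (N, k, 3)
    center = np.broadcast_to(points[:, None, :], neighbors.shape)   # (N, k, 3)
    return np.concatenate([center, neighbors - center], axis=-1)

def neighbor_pair_offsets(points, idx):
    """ASSUMPTION, not the paper's formula: offsets between consecutive
    neighbors (next neighbor minus current, cyclic). Shape (N, k, 3)."""
    neighbors = points[idx]
    return np.roll(neighbors, -1, axis=1) - neighbors

rng = np.random.default_rng(0)
cloud = rng.normal(size=(128, 3))            # toy point cloud
idx = knn_indices(cloud, k=8)
edge = edge_features(cloud, idx)
extra = neighbor_pair_offsets(cloud, idx)
combined = np.concatenate([edge, extra], axis=-1)   # (128, 8, 9)
```

In a network, `combined` (or the location and structure parts separately, as the abstract describes) would be passed through a shared multilayer perceptron and pooled over the neighbor axis. That part is omitted here because the abstract does not specify layer sizes.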

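The two metrics reported in the Result part can be computed as follows. This is a small sketch using the usual definitions, which the abstract names but does not spell out: OA is the fraction of all samples classified correctly, and mAcc is the unweighted mean of the per-class accuracies. The labels below are toy values, not data from the paper.

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def mean_class_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracy over the classes present in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    per_class = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(per_class))

# Toy, imbalanced example: six samples of class 0, two of class 1.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 1, 0])

oa = overall_accuracy(y_true, y_pred)       # 7 of 8 correct -> 0.875
macc = mean_class_accuracy(y_true, y_pred)  # (1.0 + 0.5) / 2 -> 0.75
```

Because mAcc weights every class equally, errors on small classes lower it more than they lower OA, which is consistent with mAcc being below OA on an imbalanced test set such as ModelNet40.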
Authors: Wu Bin; Liu Yian; Zhao Jie (School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China)

Affiliation: School of Computer and Information Engineering, Tianjin Chengjian University

Source: Journal of Image and Graphics (中国图象图形学报), indexed in CSCD and the Peking University Core Journals list, 2024, No. 2, pp. 520-532 (13 pages)

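Funding: Tianjin Key Research and Development Program, Science and Technology Support Key Project (19YFZCGX00130); Tianjin Enterprise Science and Technology Commissioner Project (19JCTPJC47200); Tianjin Research Innovation Project for Postgraduate Students (2021YJSS351).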

Keywords: point cloud; edge convolution (EdgeConv); spatial structure; attention mechanism; classification

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]

