Co-segmentation of 3D shape clusters based on implicit decoder


Abstract: Objective: To establish correspondences between the semantic parts of 3D shapes and to segment shapes automatically, an unsupervised co-segmentation network for 3D shape clusters based on an implicit decoder (IM-decoder) is proposed. Method: The 3D point cloud shapes are first voxelized, and a CNN-encoder (convolutional neural network encoder) extracts features from the voxelized point cloud shapes, mapping the shape information into the feature space. An attention module then aggregates the features of adjacent points in the 3D shape. The aggregated features, together with the 3D point coordinates, are fed to the IM-decoder to strengthen the spatial perception of the shape, and the decoder outputs the inside/outside state of each sampling point relative to the shape parts. Finally, max pooling aggregates the implicit fields generated by the decoder to obtain the co-segmentation result. Result: Experiments show that the proposed algorithm reaches an mIoU (mean intersection over union) of 62.1% on the ShapeNet Part dataset, which is 22.5% and 18.9% higher, respectively, than the two known types of unsupervised 3D point cloud shape segmentation methods. Compared with two supervised methods, the mIoU is 19.3% and 20.2% lower, respectively, but the method segments shapes with fewer parts more accurately. Using the mean square error function rather than the cross-entropy function as the reconstruction loss yields higher segmentation accuracy, improving mIoU by 26.3%. Conclusion: Compared with current mainstream unsupervised segmentation algorithms, the proposed unsupervised method, which uses an implicit decoder for 3D shape cluster co-segmentation, achieves higher segmentation accuracy.

Extended abstract:

Objective: 3D shape segmentation is a prerequisite for many 3D data processing applications. It is an active research topic in areas such as digital geometry processing and modeling, and it plays a crucial role in tasks such as 3D printing, 3D shape retrieval, and medical organ segmentation. With the continuous development of 3D data acquisition equipment such as laser scanners, RGBD cameras, and stereo cameras, 3D point cloud data have become widely used in 3D shape segmentation. According to how the 3D point cloud is represented, deep learning methods for 3D point cloud segmentation are divided into three categories: 1) volumetric-based methods, 2) view-based methods, and 3) point-based methods. Volumetric-based methods take voxels in 3D space as the domain for 3D convolution, extending the convolutional neural network (CNN) to 3D space for feature learning; point cloud shape segmentation is then realized by aggregating the learned features. View-based methods use spatial projection to convert the input 3D shape into multiple 2D image views and feed the stack of images into a 2D CNN to extract the features of the input shape; these features are then further processed through view pooling and a CNN to refine the segmentation results. Point-based methods handle the unordered, irregularly distributed points of the input cloud by designing a dedicated network input layer, so that the 3D point cloud can be fed directly into the network for training, which improves segmentation performance. However, because typical point cloud data lack topology and surface information and labeling large datasets is difficult, such networks cannot achieve efficient co-segmentation of shape clusters through part reconstruction. Considering that human object recognition is part-based, and that view-based methods yield unstable segmentation under occlusion and varying illumination and projection angle, this paper voxelizes the point cloud data. Moreover, most existing deep learning methods for 3D shape segmentation are supervised, and automatic 3D shape segmentation is difficult to achieve without exploiting the latent connections between shapes. This paper therefore proposes an unsupervised 3D shape cluster co-segmentation network based on the implicit decoder (IM-decoder) to establish correspondences between semantically related parts and to segment 3D shapes automatically.

Method: The unsupervised 3D shape cluster co-segmentation method based on the implicit decoder consists of three main operations: encoding, feature aggregation, and decoding. The encoding stage extracts features from the input 3D shape. The encoder network designed in this paper is based on conventional CNNs, which can only process regular 3D data, so all points of the 3D point cloud shape are first voxelized. The Hierarchical Surface Prediction method is then used to improve the quality of the reconstructed 3D shape. Finally, the CNN encoder extracts the features of the voxelized points, mapping the shape information into the feature space. The feature aggregation operation uses an attention module, which aggregates the features of adjacent points in the 3D shape, to further improve the quality of the extracted features. In the decoding stage, the aggregated features and the 3D point coordinates are input to the IM-decoder to enhance the spatial perception of the shape, and the decoder outputs the inside/outside state of each sampling point relative to the shape parts. The final co-segmentation is obtained by max pooling, which aggregates the implicit fields generated by the decoder.

Result: Ablation and comparative experiments are conducted on the ShapeNet Part dataset, with intersection over union (IoU) and mean intersection over union (mIoU) as evaluation criteria. The proposed algorithm reaches an mIoU of 62.1% on the ShapeNet Part dataset. Compared with the two known types of unsupervised 3D point cloud shape segmentation methods, its mIoU is higher by 22.5% and 18.9%, respectively. Compared with two supervised methods, its mIoU is lower by 19.3% and 20.2%, respectively, but the method achieves better segmentation on shapes with fewer parts. Using the mean square error function instead of the cross-entropy function as the reconstruction loss yields higher segmentation accuracy, improving mIoU by 26.3%. The ablation experiment shows that the attention module designed in this paper improves the segmentation accuracy of the network by automatically selecting important features from each shape type.

Conclusion: The experimental results show that the 3D shape cluster co-segmentation method based on the implicit decoder achieves high segmentation accuracy. First, the method uses a CNN-encoder to extract the features of the 3D shape and designs an attention module that automatically selects important features, which further improves feature quality. Second, the implicit decoder performs collaborative analysis on the joint feature vector composed of the selected features and the 3D point coordinates. Moreover, the implicit field obtained after fine-tuning the reconstruction loss function effectively improves segmentation accuracy.
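
The last step of the method (max pooling over the implicit fields produced by the decoder), the mean square error reconstruction loss, and the mIoU criterion can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the decoder emits one implicit value per branch for every sampled point, that a point is assigned to the branch with the largest value, and that values below a threshold of 0.5 count as outside the shape. The function names, the threshold, and the toy values are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def pool_branches(branch_fields: Sequence[Sequence[float]]) -> List[float]:
    """Max-pool the per-branch implicit values into one occupancy value per point."""
    return [max(point) for point in branch_fields]


def assign_parts(branch_fields: Sequence[Sequence[float]],
                 threshold: float = 0.5) -> List[int]:
    """Label each point with its strongest branch, or -1 if it lies outside.

    Assumption: the arg-max branch is taken as the part label, and the
    0.5 threshold separates inside from outside.
    """
    labels = []
    for point in branch_fields:
        best = max(range(len(point)), key=lambda k: point[k])
        labels.append(best if point[best] >= threshold else -1)
    return labels


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean square error between pooled occupancy and ground-truth occupancy."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def mean_iou(pred: Sequence[int], truth: Sequence[int],
             parts: Sequence[int]) -> float:
    """Average the intersection over union of every listed part label."""
    ious = []
    for part in parts:
        inter = sum(1 for p, t in zip(pred, truth) if p == part and t == part)
        union = sum(1 for p, t in zip(pred, truth) if p == part or t == part)
        # Assumed convention: a part absent from both prediction and truth scores 1.
        ious.append(inter / union if union else 1.0)
    return sum(ious) / len(ious)


# Toy data: four sampled points, two decoder branches (hypothetical values).
fields = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.1]]
labels = assign_parts(fields)
print(labels)                                    # [0, 0, 1, -1]
print(mean_iou(labels, [0, 1, 1, -1], [0, 1]))   # 0.5
```

In this sketch the pooled field is what a reconstruction loss such as `mse_loss` would compare against the ground-truth occupancy, while the per-branch arg-max supplies the part labels that mIoU scores.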

Authors: Yang Jun (杨军); Zhang Minmin (张敏敏) (Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China; School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)

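Affiliations: Faculty of Geomatics, Lanzhou Jiaotong University; School of Automation and Electrical Engineering, Lanzhou Jiaotong University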

Source: Journal of Image and Graphics (《中国图象图形学报》), CSCD, Peking University Core Journal, 2022, No. 2, pp. 550-561 (12 pages)

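Funding: National Natural Science Foundation of China (61862039); Gansu Provincial Science and Technology Program (20JR5RA429); Lanzhou Talent Innovation and Entrepreneurship Project (2020-RC-22); Tianyou Innovation Team Project of Lanzhou Jiaotong University (TY202002).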

Keywords: co-segmentation; shape clusters; implicit decoder; attention module; unsupervised

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

