Co-segmentation of 3D shape clusters based on implicit decoder (利用隐式解码器的三维模型簇协同分割)
Abstract
Objective: 3D shape segmentation is an important task on which many 3D data processing applications depend. It is an active research topic in digital geometry processing and modeling, and it plays a crucial role in tasks such as 3D printing, 3D shape retrieval, and medical organ segmentation. The continuous development of 3D data acquisition equipment, such as laser scanners, RGBD cameras, and stereo cameras, has made 3D point cloud data widely used in 3D shape segmentation. According to how the 3D point cloud is represented, deep learning methods for 3D point cloud segmentation are commonly divided into three categories: 1) volumetric-based methods, 2) view-based methods, and 3) point-based methods. Volumetric-based methods use voxels in 3D space as the definition domain, extend the convolutional neural network (CNN) to 3D space to learn features with 3D convolutions, and segment the point cloud shape by aggregating the learned features. View-based methods use spatial projection to convert the input 3D shape into multiple 2D image views, feed the stack of images into a 2D CNN to extract shape features, and then refine the segmentation results by further processing these features through view pooling and a CNN. Point-based methods handle unordered and irregularly distributed input points by designing a specific network input layer, so that the 3D point cloud can be fed directly into the network for training, which improves segmentation performance. However, because typical point cloud data lack topology and surface information and labeling large datasets is difficult, such a network cannot achieve efficient co-segmentation of shape clusters through component reconstruction. Considering that human object recognition is based on parts, and that view-based methods yield unstable segmentation under occlusion and varying illumination and projection angle, this paper voxelizes the point cloud data. Moreover, most existing deep learning methods for 3D shape segmentation are supervised, and automatic 3D shape segmentation is difficult to achieve without effectively exploiting the potential connections between shapes. This paper therefore proposes an unsupervised 3D shape cluster co-segmentation network based on the implicit decoder (IM-decoder) to establish correspondences between semantically related components and to segment 3D shapes automatically.
Method: The unsupervised 3D shape cluster co-segmentation method based on the implicit decoder consists of three main operations: encoding, feature aggregation, and decoding. The encoding stage extracts features from the input 3D shape. The encoder network is based on a traditional CNN and can therefore only process regular 3D data. First, all points of the 3D point cloud shape are voxelized. Then, the Hierarchical Surface Prediction method is used to improve the quality of the reconstructed 3D shape. Finally, the CNN encoder (CNN-encoder) extracts features from the voxelized point cloud and maps the shape information to the feature space. The feature aggregation operation uses an attention module to aggregate the features of adjacent points in the 3D shape, which further improves the quality of the extracted features. In the decoding stage, the aggregated features and the 3D point coordinates are input to the IM-decoder to enhance the spatial perception of the shape, and the decoder outputs the inside/outside state of each sampling point relative to the shape components. Finally, a max pooling operation aggregates the implicit fields generated by the decoder to produce the co-segmentation result.
Result: Ablation and comparative experiments are conducted on the ShapeNet Part dataset, using intersection over union (IoU) and mean intersection over union (mIoU) as evaluation criteria. The mIoU of our algorithm on the ShapeNet Part dataset reaches 62.1%. Compared with the two known types of unsupervised 3D point cloud shape segmentation methods, its mIoU is higher by 22.5% and 18.9%, respectively. Compared with two supervised methods, its mIoU is lower by 19.3% and 20.2%, respectively, but our method achieves better segmentation on shapes with fewer parts. Using the mean square error function rather than the cross-entropy function as the reconstruction loss yields higher segmentation accuracy, improving mIoU by 26.3%. The ablation experiment shows that the attention module improves the segmentation accuracy of the network by automatically selecting important features from each shape type.
Conclusion: The experimental results show that the 3D shape cluster co-segmentation method based on the implicit decoder achieves higher segmentation accuracy than current mainstream unsupervised segmentation algorithms. On the one hand, the method uses the CNN-encoder to extract the features of the 3D shape and an attention module to select important features automatically, which further improves feature quality. On the other hand, the implicit decoder performs collaborative analysis on the joint feature vector composed of the selected features and the 3D point coordinates. Moreover, the implicit field obtained by fine-tuning the reconstruction loss function effectively improves segmentation accuracy.
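The decoding and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the decoder has one output branch per candidate part, that each branch returns an inside value in [0, 1] for every sampling point, and that a point's part label is the branch with the largest value (an assumption; the abstract only states that max pooling aggregates the implicit fields). It also assumes that branch indices have already been matched to ground-truth part labels and that an empty union counts as an IoU of 1.0. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def aggregate_branches(fields: Sequence[Sequence[float]]) -> Tuple[List[float], List[int]]:
    """Aggregate per-branch implicit fields.

    fields[p][k] is the (assumed) inside value of sampling point p for branch k.
    Max pooling over branches gives the occupancy of the whole shape;
    the index of the winning branch is taken as the part label (assumption).
    """
    occupancy = [max(row) for row in fields]
    labels = [max(range(len(row)), key=lambda k: row[k]) for row in fields]
    return occupancy, labels


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean square error between the pooled field and the true inside/outside state."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def part_iou(pred: Sequence[int], gt: Sequence[int], part: int) -> float:
    """Intersection over union of one part label over a set of points."""
    inter = sum(1 for p, g in zip(pred, gt) if p == part and g == part)
    union = sum(1 for p, g in zip(pred, gt) if p == part or g == part)
    # Assumed convention: a part absent from both prediction and ground truth scores 1.0.
    return inter / union if union else 1.0


def mean_iou(pred: Sequence[int], gt: Sequence[int], parts: Sequence[int]) -> float:
    """Average the per-part IoU over the given part labels."""
    return sum(part_iou(pred, gt, k) for k in parts) / len(parts)


# Toy example: four sampling points, two decoder branches (made-up values).
fields = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.05], [0.7, 0.6]]
occupancy, labels = aggregate_branches(fields)
print(occupancy)  # [0.9, 0.8, 0.1, 0.7]
print(labels)     # [0, 1, 0, 0]
print(mse_loss(occupancy, [1.0, 1.0, 0.0, 1.0]))
```

In this sketch the reconstruction loss is computed on the pooled field only, so no part labels are needed during training, which is consistent with the unsupervised setting. Swapping `mse_loss` for a cross-entropy function would correspond to the loss comparison reported in the results.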
Authors Yang Jun (杨军); Zhang Minmin (张敏敏) (Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China; School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)
Source Journal of Image and Graphics (《中国图象图形学报》), CSCD, Peking University Core Journal, 2022, No. 2, pp. 550-561 (12 pages)
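Funding National Natural Science Foundation of China (61862039); Gansu Province Science and Technology Program (20JR5RA429); Lanzhou Talent Innovation and Entrepreneurship Project (2020-RC-22); Tianyou Innovation Team Project of Lanzhou Jiaotong University (TY202002).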
Keywords co-segmentation; shape clusters; implicit decoder; attention module; unsupervised