Abstract
To address the problem that deep-learning-based 3D voxel reconstruction usually relies on complex network structures and requires a large number of annotations for training, an improved deep-learning method for 3D voxel reconstruction is proposed. The method requires no image annotations or object labels during training or testing, and the redundant residual modules are removed from the network model. To further improve reconstruction accuracy, the network first encodes the input images into low-dimensional features using a standard CNN structure; LSTM units then selectively update their cell states or retain their previous states; finally, a decoder decodes the hidden states of the LSTM units and produces the 3D probabilistic voxel reconstruction. The end-to-end network learns the mapping from images of a target object to its 3D shape from a large amount of synthetic data. By training the encoder and decoder, the trained model can accept one or more images of the target object taken from arbitrary viewpoints and output a voxel model of that object. Experiments on the ShapeNet dataset show that the improved method achieves better reconstruction results without residual modules and with lower resource consumption.
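The encode, recurrent update, and decode pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the linear layers stand in for the CNN encoder and the decoder, the weights are random and untrained, and every size and name (`FEAT`, `HID`, `GRID`, `encode`, `lstm_step`, `decode`, `reconstruct`) is an assumption chosen for illustration. It only shows how one or more views could be folded into a single LSTM state that is then decoded into per-voxel occupancy probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
FEAT = 16   # length of the low-dimensional image feature
HID = 32    # LSTM hidden/cell state size
GRID = 4    # output grid is GRID x GRID x GRID voxels


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def encode(image, w_enc):
    """Stand-in for the CNN encoder: flatten, linear map, ReLU."""
    return np.maximum(0.0, w_enc @ image.ravel())


def lstm_step(x, h, c, w, b):
    """One standard LSTM update; gates decide how much state to keep or overwrite."""
    z = w @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)          # input, forget, output gates and candidate
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c_new = f * c + i * g                # keep part of the old state, add new evidence
    h_new = o * np.tanh(c_new)
    return h_new, c_new


def decode(h, w_dec):
    """Stand-in for the decoder: linear map plus sigmoid gives occupancy probabilities."""
    return sigmoid(w_dec @ h).reshape(GRID, GRID, GRID)


def init_params(image_shape):
    n_pix = int(np.prod(image_shape))
    return {
        "w_enc": rng.normal(0.0, 0.1, (FEAT, n_pix)),
        "w_lstm": rng.normal(0.0, 0.1, (4 * HID, FEAT + HID)),
        "b_lstm": np.zeros(4 * HID),
        "w_dec": rng.normal(0.0, 0.1, (GRID ** 3, HID)),
    }


def reconstruct(images, params):
    """Fold any number of views into one LSTM state, then decode it to voxels."""
    h = np.zeros(HID)
    c = np.zeros(HID)
    for img in images:
        x = encode(img, params["w_enc"])
        h, c = lstm_step(x, h, c, params["w_lstm"], params["b_lstm"])
    return decode(h, params["w_dec"])


params = init_params((8, 8))
views = [rng.random((8, 8)) for _ in range(3)]   # three synthetic "views"
voxels = reconstruct(views, params)              # probabilities in (0, 1)
occupancy = voxels > 0.5                         # thresholded voxel model
```

Because the recurrent state is updated once per view, the same `reconstruct` function accepts a single image or several, which mirrors the abstract's claim that the trained model can take one or more views as input.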
Authors
朱德榕
贺赛先
ZHU Derong; HE Saixian (Electronic Information School, Wuhan University, Wuhan 430000, China)
Source
《激光杂志》
CAS
Peking University Core Journal (北大核心)
2021, Issue 8, pp. 39-44 (6 pages)
Laser Journal