Abstract
Single-view 3D reconstruction based on deep learning is an active research topic. To recover more high-frequency detail, the SDF-SRN algorithm introduces positional encoding, but without accurate supervision the network tends to overfit and produces bumpy, uneven surfaces. To address this problem, this paper proposes a network model based on sparse features that uses residual learning so that the overfitting-prone network predicts only the high-frequency residual. A feature extraction network produces sparse features and global features. The sparse features are fed to one hypernetwork, which generates a shallow prediction head that predicts the low-frequency part of the signed distance function; the global features are fed to a second hypernetwork, which generates another shallow head that predicts the high-frequency residual. The two predictions are combined through a weighting factor to form the final signed distance function. Spectrum analysis shows that the network achieves this design goal. Compared with other smooth-surface reconstruction schemes, the residual-learning scheme yields smoother surfaces while retaining sufficient detail, and it overcomes the overfitting of SDF-SRN. Qualitative and quantitative comparisons with other advanced single-view reconstruction methods demonstrate the superiority of the proposed method.
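The two-branch structure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. Every name and size here is hypothetical: the feature dimensions, the single linear layer standing in for each hypernetwork, the one-hidden-layer shallow heads, the choice to apply positional encoding only in the high-frequency branch, and the additive combination `low + alpha * high`. The abstract states only that the two predictions are combined through a weighting factor.

```python
import numpy as np

rng = np.random.default_rng(0)

def positional_encoding(x, num_freqs=4):
    """Map (N, 3) points to (N, 3 + 6 * num_freqs) sin/cos features."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi      # (F,)
    angles = x[:, :, None] * freqs                     # (N, 3, F)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return np.concatenate([x, enc.reshape(len(x), -1)], axis=-1)

def make_hypernetwork(feat_dim, in_dim, hidden=16):
    """Hypothetical hypernetwork: one linear map from an image feature
    vector to the parameters of a one-hidden-layer shallow head."""
    n_params = in_dim * hidden + hidden + hidden + 1
    proj = rng.normal(scale=0.1, size=(feat_dim, n_params))

    def hyper(feat):
        p = feat @ proj
        i = in_dim * hidden
        w1 = p[:i].reshape(in_dim, hidden)
        b1 = p[i:i + hidden]
        w2 = p[i + hidden:i + 2 * hidden]
        b2 = p[i + 2 * hidden]
        return w1, b1, w2, b2

    return hyper

def shallow_head(x, params):
    """Evaluate the generated shallow head at query inputs x of shape (N, in_dim)."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(x @ w1 + b1, 0.0)              # ReLU
    return hidden @ w2 + b2                            # (N,)

def predict_sdf(points, sparse_feat, global_feat, hyper_low, hyper_high, alpha=0.1):
    """Low-frequency SDF plus a weighted high-frequency residual (assumed form)."""
    low = shallow_head(points, hyper_low(sparse_feat))
    high = shallow_head(positional_encoding(points), hyper_high(global_feat))
    return low + alpha * high, low, high

# Stand-ins for the outputs of the feature extraction network.
sparse_feat = rng.normal(size=32)
global_feat = rng.normal(size=64)
hyper_low = make_hypernetwork(32, in_dim=3)
hyper_high = make_hypernetwork(64, in_dim=3 + 6 * 4)

points = rng.uniform(-1.0, 1.0, size=(5, 3))
sdf, low, high = predict_sdf(points, sparse_feat, global_feat, hyper_low, hyper_high)
print(sdf.shape)
```

In this sketch the low-frequency branch sees raw coordinates and so can only represent smooth variation, while the positionally encoded branch contributes a residual scaled by `alpha`; keeping `alpha` small is one plausible way to limit how much an overfitting-prone branch can roughen the surface.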
Authors
梁春阳 (Liang Chunyang)
唐红梅 (Tang Hongmei)
席建锐 (Xi Jianrui)
刘鑫 (Liu Xin)
School of Electronics & Information Engineering, Hebei University of Technology, Tianjin 300401, China
Source
《计算机应用研究》
CSCD
Peking University Core Journal
2023, No. 3, pp. 925-931, 937 (8 pages)
Application Research of Computers
Funding
Supported by the Natural Science Foundation of Hebei Province (F2019202387).
Keywords
deep learning
single-view reconstruction
signed distance function
positional encoding
hypernetwork
residual learning