Funding: This work is funded by the Key Scientific Research Projects of Colleges and Universities in Henan Province under Grant 22A460022 and the Training Plan for Young Backbone Teachers in Colleges and Universities in Henan Province under Grant 2021GGJS077.
Abstract: Traditional three-dimensional (3D) image reconstruction methods depend heavily on the environment and reconstruct poorly, which easily leads to mismatches and poor real-time performance. The accuracy of feature extraction from multiple images affects the reliability and real-time performance of 3D reconstruction. To solve this problem, this paper proposes a multi-view image 3D reconstruction algorithm based on a self-encoding convolutional neural network. The algorithm first extracts feature information from multiple two-dimensional (2D) images using the scale- and rotation-invariant parameters of the scale-invariant feature transform (SIFT) operator. Second, a self-encoding learning neural network is introduced into the feature refinement process to take full advantage of its feature extraction ability. Then, Fish-Net replaces the U-Net structure inside the self-encoding network to improve gradient propagation between U-Net structures, and a generative adversarial network (GAN) loss function replaces the mean square error (MSE) to better express image features, discarding useless features to obtain effective ones. Finally, an incremental structure from motion (SFM) algorithm computes the rotation matrix and translation vector of the camera, the feature points are triangulated to obtain a sparse spatial point cloud, and MeshLab is used to display the results. Simulation experiments show that, compared with the traditional method, the proposed image feature extraction method significantly improves the rendering of the 3D point cloud, with an accuracy rate of 92.5% and a reconstruction completeness rate of 83.6%.
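The triangulation step mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which triangulation scheme the authors use, so the code below swaps in simple midpoint triangulation of two viewing rays as an assumption; the function name `triangulate_midpoint` and the ray representation (camera center plus direction) are hypothetical and chosen only for illustration.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between two viewing rays.

    Each ray is given as camera center c and direction d (3-element
    sequences), i.e. points of the form c + t * d. This is a simple
    stand-in for triangulation, not the method of the cited paper.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [a - b for a, b in zip(c1, c2)]          # c1 - c2
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")

    # Ray parameters that minimize the distance between the two rays.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom

    p1 = [ci + t1 * di for ci, di in zip(c1, d1)]
    p2 = [ci + t2 * di for ci, di in zip(c2, d2)]
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]


# Two cameras one unit apart, both looking at the same scene point.
point = triangulate_midpoint([0.0, 0.0, 0.0], [0.5, 0.2, 2.0],
                             [1.0, 0.0, 0.0], [-0.5, 0.2, 2.0])
```

With noise-free rays the midpoint coincides with the true scene point; with noisy matches it returns the point halfway between the two closest ray positions.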
Abstract: In the multi-view image localization task, the features of images captured from different views should be fused properly. This paper considers the classification-based image localization problem. We propose the relational graph location network (RGLN) to perform this task. Within this network, we propose a heterogeneous graph construction approach for graph classification tasks, which aims to describe the location more appropriately and thereby improves the expressive ability of the location representation module. Experiments show that the expressive ability of the proposed graph construction approach outperforms the compared methods by a large margin. In addition, the proposed localization method outperforms the compared localization methods by around 1.7% in terms of meter-level accuracy.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61772561 (author J.Q, http://www.nsfc.gov.cn/); in part by the Key Research and Development Plan of Hunan Province under Grant 2018NK2012 (author J.Q, http://kjt.hunan.gov.cn/); in part by the Key Research and Development Plan of Hunan Province under Grant 2019SK2022 (author Y.T, http://kjt.hunan.gov.cn/); in part by the Science Research Projects of Hunan Provincial Education Department under Grant 18A174 (author X.X, http://kxjsc.gov.hnedu.cn/); in part by the Science Research Projects of Hunan Provincial Education Department under Grant 19B584 (author Y.T, http://kxjsc.gov.hnedu.cn/); in part by the Degree & Postgraduate Education Reform Project of Hunan Province under Grant 2019JGYB154 (author J.Q, http://xwb.gov.hnedu.cn/); in part by the Postgraduate Excellent Teaching Team Project of Hunan Province under Grant [2019]370-133 (author J.Q, http://xwb.gov.hnedu.cn/); in part by the Postgraduate Education and Teaching Reform Project of Central South University of Forestry & Technology under Grant 2019JG013 (author X.X, http://jwc.csuft.edu.cn/); in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4140) (author Y.T, http://kjt.hunan.gov.cn/); and in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4141) (author X.X, http://kjt.hunan.gov.cn/).
Abstract: In recent years, with the massive growth of image data, quickly and efficiently matching the images users need has become a challenge. Compared with single-view features, multi-view features describe image information more accurately. The advantages of hashing in reducing data storage and improving efficiency also motivate us to study how to apply it effectively to large-scale image retrieval. In this paper, a hash algorithm for multi-index image retrieval based on multi-view feature coding is proposed. By learning the data correlation between different views, the algorithm uses multi-view data with deeper image semantics to achieve better retrieval results. It uses a quantitative hash method to generate binary sequences and uses the hash codes generated from the associated features to construct inverted index files for the database, reducing the memory burden and promoting efficient matching. To reduce hash-code matching errors and ensure retrieval accuracy, the algorithm uses an inverted multi-index structure instead of a single-index structure. Compared with other advanced image retrieval methods, this method has better retrieval performance.
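The idea of indexing binary hash codes in more than one inverted table can be sketched as follows. The abstract does not describe the actual index layout, so this is only an illustration under stated assumptions: each code is split into two halves, each half keys its own inverted table, candidates are the union of both lookups, and they are ranked by Hamming distance. The class name `InvertedMultiIndex`, the two-way split, and the tie-breaking by item id are all hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


class InvertedMultiIndex:
    """Toy multi-table inverted index over binary codes (an assumed layout)."""

    def __init__(self, code_len):
        self.half = code_len // 2
        # One inverted table per code half: sub-code -> set of item ids.
        self.tables = (defaultdict(set), defaultdict(set))
        self.codes = {}

    def _split(self, code):
        return code[:self.half], code[self.half:]

    def add(self, item_id, code):
        code = tuple(code)
        self.codes[item_id] = code
        for table, sub in zip(self.tables, self._split(code)):
            table[sub].add(item_id)

    def query(self, code, top_k=3):
        code = tuple(code)
        candidates = set()
        for table, sub in zip(self.tables, self._split(code)):
            candidates |= table.get(sub, set())
        # Rank the candidates by Hamming distance, breaking ties by id.
        ranked = sorted(candidates,
                        key=lambda i: (hamming(code, self.codes[i]), i))
        return ranked[:top_k]


index = InvertedMultiIndex(code_len=4)
index.add("a", (1, 0, 1, 1))
index.add("b", (1, 0, 0, 0))
index.add("c", (0, 1, 0, 0))
hits = index.query((1, 0, 1, 1))
```

An item is retrieved as a candidate when it matches the query on at least one half of the code, which tolerates bit errors in the other half; a single-table index keyed on the full code would miss such near matches.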
Abstract: Subpixel localization of the image center is one of the key technologies of vision measurement. Existing sub-pixel positioning methods struggle to meet the requirements of accurate calibration and measurement across multiple fields: they are complex, and their positioning accuracy is low and strongly affected by the quality of the initial edge extraction. Because remote sensing multi-view images are usually non-stationary random signals, random analysis is incorporated to better express the non-stationary characteristics of the images and to segment sub-pixel objects in the center of remote sensing images. The accuracy of mark positioning affects the accuracy of the whole measurement. Control point marks with different characteristics correspond to different recognition methods, so control point marks should be selected according to the requirements at hand. The method describes the target view from different viewpoints and uses geometric features to retrieve the model library. The matching process uses global and local, statistical and structural target recognition features hierarchically, and is divided into two steps: retrieval and exact matching. An experiment was carried out to verify the effectiveness of the method.
Funding: Funding support from the 1000 Youth Talents Plan of China (P.F.); the Fundamental Research Program of Shenzhen (P.F., JCYJ20160429182424047); the National Science Foundation of China (NSFC31571002, D.Z); and the Graduates' Innovation Fund of Huazhong University of Science and Technology (5003182004).
Abstract: We present three-dimensional (3D) isotropic imaging of the mouse brain using light-sheet fluorescent microscopy (LSFM) in conjunction with multi-view imaging computation. Unlike the common single-view LSFM used for mouse brain imaging, the brain tissue in our study is imaged in 3D under eight views by a home-built selective plane illumination microscopy (SPIM) system. An output image containing complete structural information as well as significantly improved resolution (~4 times) is then computed from these eight views of data, using bead-guided multi-view registration and deconvolution. With superior imaging quality, the astrocytes and pyramidal neurons, together with their subcellular nerve fibers, can be clearly visualized and segmented. By further including other computational methods, this study can potentially be scaled up to map the connectome of the whole mouse brain with a simple light-sheet microscope.
Funding: Supported by the Key R&D Projects in Shaanxi Province (2022JBGS3-08).
Abstract: Rapidly and accurately assessing the geometric characteristics of coarse aggregate particles is crucial for ensuring pavement performance in highway engineering. This article introduces an innovative system for the three-dimensional (3D) surface reconstruction of coarse aggregate particles using occlusion-free multi-view imaging. The system captures synchronized images of particles in free fall, employing a matte sphere and a nonlinear optimization approach to estimate the camera projection matrices. A pre-trained segmentation model is used to remove the image backgrounds. The Shape from Silhouettes (SfS) algorithm is then applied to generate 3D voxel data, followed by the Marching Cubes algorithm to construct the 3D surface contour. Validation against standard parts and diverse coarse aggregate particles confirms the method's high accuracy, with an average measurement precision of 0.434 mm and a significant increase in scanning and reconstruction efficiency.
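The voxel-generation step of Shape from Silhouettes can be sketched in a few lines. The system described in the abstract uses calibrated camera projection matrices; the sketch below instead assumes orthographic views along the coordinate axes purely to stay short, and the function name `carve` and the mask layout are hypothetical. A voxel is kept only if its projection falls inside the silhouette in every view.

```python
def carve(silhouettes, n):
    """Silhouette carving on an n x n x n voxel grid.

    silhouettes maps a viewing axis (0 = x, 1 = y, 2 = z) to an n x n
    binary mask indexed by the two remaining coordinates in ascending
    order. Orthographic projection is a simplifying assumption here.
    """
    occupied = set()
    for x in range(n):
        for y in range(n):
            for z in range(n):
                voxel = (x, y, z)
                inside_all = True
                for axis, mask in silhouettes.items():
                    # Drop the coordinate along the viewing axis.
                    u, v = [c for i, c in enumerate(voxel) if i != axis]
                    if not mask[u][v]:
                        inside_all = False
                        break
                if inside_all:
                    occupied.add(voxel)
    return occupied


n = 3
full = [[1] * n for _ in range(n)]
center_only = [[1 if (u, v) == (1, 1) else 0 for v in range(n)]
               for u in range(n)]

# One narrow view leaves a column of voxels; a second narrow view
# from another direction carves it down to a single voxel.
column = carve({2: center_only, 0: full}, n)
single = carve({2: center_only, 1: center_only}, n)
```

Each additional view can only remove voxels, which is why more viewpoints give a tighter hull. The resulting occupancy set is the kind of voxel data a surface-extraction step such as Marching Cubes would consume.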
Funding: Project supported by the National Natural Science Foundation of China (No. 60802013) and the Natural Science Foundation of Zhejiang Province, China (No. Y106574).
Abstract: A new algorithm is proposed for restoring disocclusion regions in depth-image-based rendering (DIBR) warped images. Current solutions include the layered depth image (LDI), pre-filtering methods, and post-processing methods. The LDI is complicated, and pre-filtering of depth images causes noticeable geometrical distortions in cases of large baseline warping. This paper presents a depth-aided inpainting method which inherits merits from Criminisi's inpainting algorithm. The proposed method incorporates a depth cue into texture estimation. The algorithm efficiently handles depth ambiguity by penalizing larger Lagrange multipliers of filling points closer to the warping position compared with the surrounding existing points. We perform morphological operations on depth images to accelerate convergence of the algorithm, and adopt a luma-first strategy to adapt to various color sampling formats. Experiments on a test multi-view sequence showed that our method is superior in depth differentiation and geometrical fidelity in the restoration of warped images. Also, peak signal-to-noise ratio (PSNR) statistics on non-hole regions and whole-image comparisons both compare favorably to those obtained by state-of-the-art techniques.
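The PSNR statistic mentioned at the end of this abstract follows the standard definition, PSNR = 10 log10(peak^2 / MSE). The sketch below shows how it might be evaluated either on the whole image or only on non-hole pixels; the function name `psnr`, the list-of-lists image layout, and the 8-bit peak value of 255 are assumptions for illustration, not details taken from the paper.

```python
import math


def psnr(reference, restored, mask=None, peak=255.0):
    """PSNR in dB between two equally sized grayscale images.

    Images are lists of rows. If mask is given, only pixels where the
    mask is truthy (for example non-hole pixels) are included.
    """
    squared_error = 0.0
    count = 0
    for r, (ref_row, res_row) in enumerate(zip(reference, restored)):
        for c, (a, b) in enumerate(zip(ref_row, res_row)):
            if mask is not None and not mask[r][c]:
                continue
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("no pixels selected")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical on the selected pixels
    return 10.0 * math.log10(peak ** 2 / mse)


reference = [[100, 100], [100, 100]]
restored = [[100, 110], [100, 100]]       # one pixel is off by 10
non_hole = [[1, 0], [1, 1]]               # treat that pixel as a hole

whole_image_db = psnr(reference, restored)
non_hole_db = psnr(reference, restored, mask=non_hole)
```

Reporting both numbers separates errors introduced by inpainting the holes from errors elsewhere in the warped image.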
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11925202 and 11872009) and the National Science and Technology Major Project (Grant No. J2019-V-0006-0099).
Abstract: Based on the recently proposed mirror-assisted multi-view digital image correlation (MV-DIC), we establish a cost-effective and easy-to-implement mirror-assisted multi-view high-speed digital image correlation (MVHS-DIC) method and explore its applications for dual-surface full-field dynamic deformation measurement. In contrast to the general requirement of four expensive high-speed cameras for dual-surface dynamic deformation field measurement, the established mirror-assisted MVHS-DIC halves the cost by involving only two synchronized high-speed cameras and two planar mirrors. The two synchronized high-speed cameras can dynamically measure the front and rear surfaces of a sheet sample simultaneously through the reflection of the two mirrors. The results on the two surfaces are then transformed into the same coordinate system, yielding the required dual-surface 3D dynamic deformation fields. The effectiveness and accuracy of the established system are validated through modal tests of a cantilever aluminum sheet. The vibration measurement of a drum and the dual-surface transient deformation measurement of a smartphone in the drop-collision process further demonstrate its practicability. By offering multi-view dynamic deformation measurement in a cost-efficient way, the established mirror-assisted MVHS-DIC is expected to encourage more comprehensive dynamic mechanical behavior characterization of regular-sized materials and structures in vibration and impact engineering.
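Bringing a surface measured through a planar mirror into the coordinate system of the real object comes down to reflecting points across the mirror plane. The abstract does not give the authors' formulation, so the sketch below only shows the standard geometry: for a plane n·x = d with unit normal n, a point p maps to p - 2(n·p - d)n. The function name `reflect_across_plane` and the plane parameterization are assumptions.

```python
import math


def reflect_across_plane(point, normal, offset):
    """Reflect a 3D point across the plane normal . x = offset.

    The normal is normalized internally, and offset is rescaled with it,
    so the plane may be given with a normal of any nonzero length.
    """
    length = math.sqrt(sum(n * n for n in normal))
    if length == 0:
        raise ValueError("plane normal must be nonzero")
    unit = [n / length for n in normal]
    d = offset / length
    # Signed distance from the point to the plane.
    distance = sum(u * p for u, p in zip(unit, point)) - d
    return [p - 2.0 * distance * u for p, u in zip(point, unit)]


# A virtual point seen behind a mirror lying in the plane x = 1.
virtual_point = [3.0, 2.0, 5.0]
real_point = reflect_across_plane(virtual_point, [1.0, 0.0, 0.0], 1.0)
```

Reflection is its own inverse, so applying the same function to `real_point` returns `virtual_point`; points lying on the mirror plane are left unchanged.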
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61402203, 61273251, and 61170120; the Fundamental Research Funds for the Central Universities under Grant No. JUSRP11458; and the Program for New Century Excellent Talents in University under Grant No. NCET-12-0881.
Abstract: In this paper, we propose a multi-kernel multi-view canonical correlations (M2CCs) framework for subspace learning. In the proposed framework, the input data of each original view are mapped into multiple higher-dimensional feature spaces by multiple nonlinear mappings determined by different kernels. This enables M2CCs to discover multiple kinds of useful information from each original view in the feature spaces. Within the framework, we further provide a specific multi-view feature learning method based on a direct summation kernel strategy, together with its regularized version. The experimental results in visual recognition tasks demonstrate the effectiveness and robustness of the proposed method.
Funding: Funded by the Guangdong Provincial Natural Science Foundation General Project (Grant No. 2023A1515011700); the GuangDong Basic and Applied Basic Research Foundation (Grant No. 2022A1515110007); the Guangdong Provincial Natural Science Foundation General Project (Grant No. 2023A1515012869); and the GDAS' Project of Science and Technology Development (Grant No. 2022GDASZH-2022010108).
Abstract: The estimation of fish mass is one of the most basic and important tasks in aquaculture. Acquiring the mass of fish at different growth stages is of great significance for feeding, monitoring the health status of fish, and making breeding plans to increase production. Existing estimation methods for fish mass often stay in the 2D plane and have difficulty obtaining 3D information about the fish, which leads to errors. To solve this problem, a multi-view method was proposed to obtain the 3D information of fish and predict their mass through a two-stage neural network with an edge-sensitive module. In the first stage, side- and downward-view images of the fish and some 3D information, such as side area, top area, length, deflection angle, and pitch angle, were captured through two vertically placed cameras to estimate the size of the fish. The area of the fish in the different views was then estimated accurately through a pre-trained image segmentation neural network with an edge-sensitive module. In the second stage, a fully connected neural network was constructed to regress the fish mass from the 3D information obtained in the previous stage. The experimental results indicate that the proposed method can accurately estimate fish mass and outperforms existing estimation methods.
Abstract: 1. Introduction. Reproduction systems for 3D images that do not require eyeglasses or other special accessories have always attracted attention and aroused great interest among developers and consumers of such equipment because of the totally accurate image and the method of its presentation. Such systems can
Funding: Supported by the Theme-based Research Scheme, Research Grants Council of Hong Kong (No. T45-205/21-N).
Abstract: Novel view synthesis has recently attracted tremendous research attention for its applications in virtual reality and immersive telepresence. Rendering a locally immersive light field (LF) based on arbitrary large-baseline RGB references is a challenging problem that lacks efficient solutions among existing novel view synthesis techniques. In this work, we aim at faithfully rendering local immersive novel views/LF images based on large-baseline LF captures and a single RGB image in the target view. To fully exploit the information in the source LF captures, we propose a novel occlusion-aware source sampler (OSS) module which efficiently transfers the pixels of the source views to the target view's frustum in an occlusion-aware manner. An attention-based deep visual fusion module is proposed to fuse the revealed occluded background content with a preliminary LF into a final refined LF. The proposed source sampling and fusion mechanism not only helps to provide information for occluded regions from varying observation angles, but also effectively enhances the visual rendering quality. Experimental results show that our proposed method is able to render high-quality LF images/novel views with sparse RGB references and outperforms state-of-the-art LF rendering and novel view synthesis methods.
Funding: Supported by the National Science and Technology Major Project, China (No. J2019-Ⅲ-0009-0053) and the National Natural Science Foundation of China (No. 12075319).
Abstract: In order to reconstruct and render the weak and repetitive texture of damaged functional surfaces in aviation, an improved neural radiance field, named TranSR-NeRF, is proposed. A data acquisition system was designed and built. Initial point clouds were generated from the acquired images through TransMVSNet. Meanwhile, features were extracted from the images through the improved SE-ConvNeXt network and then aligned and fused with the initial point cloud to generate a high-quality neural point cloud. After ray tracing and sampling of the neural point cloud, the ResMLP neural network designed in this paper, which introduces spatial coordinate and relative positional encoding, was used to regress the volume density and radiance under a given viewing angle. In this way, arbitrary-scale super-resolution reconstruction and rendering of the damaged functional surface is realized. The influence of illumination conditions and background environment on model performance is also studied experimentally, and comparison and ablation experiments are conducted for the proposed improvements. The experimental results show that the improved model performs well. Finally, an application experiment on an object detection task is carried out, and the results show that the model has good practicability.
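Once a network has regressed volume density and radiance at samples along a ray, neural radiance fields typically turn them into a pixel color with the standard volume-rendering quadrature: each sample contributes its color weighted by its opacity, 1 - exp(-sigma * delta), and by the transmittance accumulated in front of it. The sketch below illustrates that generic compositing step only; it is not taken from TranSR-NeRF, and the function name `composite_ray` and its argument layout are assumptions.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, ordered front to back.

    densities: volume density sigma at each sample
    colors:    RGB triple at each sample
    deltas:    distance between consecutive samples
    Returns (pixel_rgb, remaining_transmittance).
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this segment
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel, transmittance


# Empty space followed by an effectively opaque green sample.
pixel, remaining = composite_ray(
    densities=[0.0, 1e9],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[1.0, 1.0],
)
```

In this example the first sample has zero density and contributes nothing, so the pixel takes the color of the opaque sample behind it and no transmittance remains.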