Fund: Supported by the Zhengzhou Collaborative Innovation Major Project under Grant No. 20XTZX06013 and the Henan Provincial Key Scientific Research Project of China under Grant No. 22A520042.
Abstract: Traditional neural radiance fields for rendering novel views require dense input images and per-scene optimization, which limits their practical applications. We propose a generalization method, named SG-NeRF (Sparse-Input Generalized Neural Radiance Fields), that infers scenes from input images and performs high-quality rendering without per-scene optimization. First, we construct an improved multi-view stereo structure based on convolutional attention and a multi-level fusion mechanism to obtain the geometric and appearance features of the scene from the sparse input images; these features are then aggregated by multi-head attention as the input of the neural radiance fields. This strategy of using neural radiance fields to decode scene features, instead of mapping positions and orientations, allows our method to train and infer across scenes, enabling neural radiance fields to generalize to novel view synthesis on unseen scenes. We tested the generalization ability on the DTU dataset, and our PSNR (peak signal-to-noise ratio) improved by 3.14 dB over the baseline method under the same input conditions. In addition, if dense input views are available for a scene, the average PSNR can be improved by a further 1.04 dB through a short refinement training stage, yielding higher-quality renderings.
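The aggregation step described above — pooling per-view geometric/appearance features with multi-head attention into a single feature that the radiance-field MLP decodes — can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function name `aggregate_views`, the head count, and the use of random matrices in place of learned projection weights are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_views(view_feats, n_heads=4):
    """Aggregate per-view features (N views x D dims) sampled at one 3D
    point into a single D-dim vector via multi-head attention among the
    views, then mean-pool. Illustrative sketch: random projections stand
    in for learned weight matrices."""
    n, d = view_feats.shape
    assert d % n_heads == 0
    dh = d // n_heads
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    # Split each projection into heads: (views, heads, head_dim).
    q = (view_feats @ Wq).reshape(n, n_heads, dh)
    k = (view_feats @ Wk).reshape(n, n_heads, dh)
    v = (view_feats @ Wv).reshape(n, n_heads, dh)
    # Scaled dot-product attention among views, per head: (heads, n, n).
    att = softmax(np.einsum('nhd,mhd->hnm', q, k) / np.sqrt(dh), axis=-1)
    out = np.einsum('hnm,mhd->nhd', att, v).reshape(n, d)
    return out.mean(axis=0)  # one feature vector fed to the NeRF MLP

feats = np.random.default_rng(1).standard_normal((3, 8))  # 3 views, 8-dim features
agg = aggregate_views(feats, n_heads=2)
print(agg.shape)  # (8,)
```

In the actual method the projections would be trained jointly with the radiance field, and the aggregated feature replaces the raw position/direction encoding as the MLP input, which is what makes cross-scene inference possible.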
Fund: This work was supported by the Collaborative Innovation Major Project of Zhengzhou under Grant No. 20XTZX06013 and the National Natural Science Foundation of China under Grant No. 61932014.
Abstract: Deep neural networks (DNNs) have been extensively studied in medical image segmentation. However, existing DNNs often need to train a shape model for each object to be segmented, which may yield results that violate cardiac anatomical structure when segmenting cardiac magnetic resonance imaging (MRI). In this paper, we propose a capsule-based neural network, named Seg-CapNet, to model multiple regions simultaneously within a single training process. The Seg-CapNet model consists of an encoder and a decoder. The encoder transforms the input image, through convolutional layers, capsule layers, and fully-connected layers, into feature vectors that represent the objects to be segmented; the decoder transforms these feature vectors into segmentation masks by up-sampling. Feature maps of each down-sampling layer in the encoder are connected to the corresponding up-sampling layers, which aids the backpropagation of the model. The output vectors of Seg-CapNet contain low-level image features such as grayscale and texture, as well as semantic features such as the position and size of the objects, which helps improve segmentation accuracy. The proposed model is validated on the open dataset of the Automated Cardiac Diagnosis Challenge 2017 (ACDC 2017) and the Sunnybrook Cardiac MRI segmentation challenge. Experimental results show that the mean Dice coefficient of Seg-CapNet is increased by 4.7% and the average Hausdorff distance is reduced by 22%. The proposed model also reduces the number of model parameters and improves training speed while accurately segmenting multiple regions.
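The key property the abstract relies on — that a capsule's output vector encodes attributes such as position and size in its direction while its length acts as a presence probability — comes from the standard capsule "squash" nonlinearity. Below is a minimal NumPy sketch of that standard function; it illustrates the general capsule mechanism, not Seg-CapNet's exact layer code.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Standard capsule 'squash' nonlinearity: preserves a vector's
    direction (which encodes object attributes such as position/size)
    while compressing its length into [0, 1), so the length can be read
    as the probability that the object is present."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

# Two hypothetical capsule outputs: a strong and a weak activation.
caps = np.array([[3.0, 4.0],
                 [0.1, 0.0]])
v = squash(caps)
print(np.linalg.norm(v, axis=-1))  # both lengths fall in [0, 1)
```

Long input vectors are squashed toward unit length and short ones toward zero, so downstream layers (and the up-sampling decoder) can treat vector length as confidence while still reading the attribute information from the vector's direction.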