Funding: Supported by the National Natural Science Foundation of China (61533019, U1811463), the Open Fund of the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences (Y6S9011F51), and in part by the EPSRC Project (EP/N025849/1).
Abstract: Eye center localization is one of the most crucial and basic requirements for human-computer interaction applications such as eye gaze estimation and eye tracking. A large body of work has addressed this topic in recent years, but accuracy still needs to be improved because of appearance challenges such as the high variability of shapes, lighting conditions, and viewing angles, as well as possible occlusions. To address these limitations, we propose a novel approach to eye center localization based on a fully convolutional network (FCN), an end-to-end, pixels-to-pixels network that can locate the eye center accurately. The key idea is to apply the FCN from semantic segmentation to eye center localization, since eye center localization can be regarded as a special semantic segmentation problem. We adapt a contemporary FCN into a shallow structure with a large-kernel convolutional block and transfer its performance from semantic segmentation to the eye center localization task by fine-tuning. Extensive experiments show that the proposed method outperforms state-of-the-art methods in both accuracy and reliability of eye center localization. The proposed method achieves a large performance improvement on the most challenging database and thus provides a promising solution for some challenging applications.
Funding: Supported by the National Science Foundation of China (No. 81800878), the Interdisciplinary Program of Shanghai Jiao Tong University (No. YG2017QN24), the Key Technological Research Projects of Songjiang District (No. 18sjkjgg24), and the Bethune Langmu Ophthalmological Research Fund for Young and Middle-aged People (No. BJ-LM2018002J).
Abstract: AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. To address the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. Because this network ignores the spatial correlations between corresponding positions in adjacent images, we proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated by segmentation accuracy, Kappa coefficient, and F1 score. For the proposed 3D fully convolutional network, the overall segmentation accuracy is 99.56%, the Kappa coefficient is 98.47%, and the F1 score for retinal fluid is 95.50%. CONCLUSION: OCT image segmentation algorithms based on deep learning are primarily founded on 2D convolutional networks. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors and can provide doctors with more accurate diagnostic data.
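The three evaluation measures named in the RESULTS (overall accuracy, Kappa coefficient, and F1 score) can all be derived from a pixel-level confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `segmentation_metrics`, the matrix layout, and the toy counts are hypothetical, and the Kappa coefficient is assumed to be Cohen's kappa.

```python
def segmentation_metrics(confusion, positive=1):
    """Return (accuracy, Cohen's kappa, F1 of `positive`) from a confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i and whose
    predicted class is j (hypothetical layout chosen for this sketch).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement is the overall pixel accuracy.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance, from row (true) and column (predicted) totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)

    # F1 score of the class of interest (e.g. retinal fluid).
    tp = confusion[positive][positive]
    fp = sum(confusion[i][positive] for i in range(n_classes)) - tp
    fn = sum(confusion[positive]) - tp
    f1 = 2 * tp / (2 * tp + fp + fn)
    return observed, kappa, f1


# Toy counts: 90 background pixels, 10 fluid pixels.
accuracy, kappa, f1 = segmentation_metrics([[85, 5], [5, 5]])
```

With an imbalanced toy matrix like this one, accuracy stays high while kappa and F1 are much lower, which is why a segmentation study with category imbalance would report all three.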
Abstract: Convolutional neural networks (CNNs) have proven to be effective clinical imaging methods. This study highlights some of the key issues within these systems. It is difficult to train these systems on limited clinical image databases, and many publications present strategies, including learning algorithms, to address this. Furthermore, these models are known for making highly reliable prognoses. In addition, normalization of volume and Dice loss have been used effectively to accelerate and stabilize training. However, these systems are improperly regulated, resulting in overly confident ratings for both correct and incorrect classifications, which are inaccurate and difficult to interpret. This study examines the uncertainty assessment of Fully Convolutional Neural Networks (FCNNs) for clinical image segmentation. The essential contributions of this work are: 1) Dice loss and cross-entropy loss are compared on the basis of segmentation quality and uncertainty assessment of FCNNs; 2) a group model is proposed for confidence measurement of fully convolutional neural networks trained with Dice loss and group normalization; and 3) the ability of the calibrated FCNs to evaluate the segmentation quality of structures and to identify out-of-distribution test examples is assessed. To evaluate these contributions, the study conducted a series of tests in three clinical image segmentation applications: heart, brain, and prostate. The findings provide significant insights into predictive uncertainty assessment and practical strategies for out-of-distribution identification and reliable measurement in clinical image segmentation. The approaches presented in this research significantly enhance the reliability and accuracy rating of CNN-based clinical imaging methods.
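The comparison between Dice loss and cross-entropy loss in contribution 1) can be illustrated on a handful of pixels. This is a minimal sketch assuming the common soft Dice formulation and binary cross-entropy; the abstract does not give the exact definitions used in the study, and the function names and smoothing constants are hypothetical.

```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss over a flat list of foreground probabilities.

    eps is a smoothing constant (an assumption of this sketch) that avoids
    division by zero when both prediction and target are empty.
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def cross_entropy_loss(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over the same flat list of pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


# Toy example: one foreground pixel among four, predicted with probability 0.6.
targets = [1, 0, 0, 0]
probs = [0.6, 0.1, 0.1, 0.1]
losses = (dice_loss(probs, targets), cross_entropy_loss(probs, targets))
```

Dice loss is computed from the overlap of the whole region, whereas cross-entropy averages a per-pixel penalty, so the two can rank the same prediction differently when the foreground is small.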
Abstract: Panoramic images are widely used in many scenes, especially in virtual reality and street view capture. However, they are new to street furniture identification, which is usually based on mobile laser scanning point cloud data or conventional 2D images. This study proposes to perform semantic segmentation on panoramic images and transformed images to separate light poles and traffic signs from the background, using pre-trained Fully Convolutional Networks (FCN). FCN is the most important model for deep learning applied to semantic segmentation because of its end-to-end training process and pixel-wise prediction. In this study, we use an FCN-8s model pre-trained on the Cityscapes dataset and fine-tune it on our own data. We then replace the cross-entropy loss function with a focal loss function in the FCN model and train it again to produce the predictions. The results show that across the pre-trained model, the fine-tuned model, and the FCN model with focal loss, light poles and traffic signs are detected well, and that the transformed images perform better than the panoramic images according to the Recall and IoU evaluation.
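Replacing cross-entropy with focal loss changes how much each pixel contributes to training. The sketch below uses the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the abstract does not report the gamma or alpha values used, so the defaults here are assumptions and the function name is hypothetical.

```python
import math


def binary_focal_loss(p, target, gamma=2.0, alpha=None, eps=1e-12):
    """Focal loss for one pixel with predicted foreground probability p.

    gamma down-weights pixels that are already classified well; with gamma=0
    and alpha=None the expression reduces to ordinary cross-entropy.
    """
    p_t = p if target == 1 else 1.0 - p  # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)        # clamp to keep log() finite
    weight = (1.0 - p_t) ** gamma
    if alpha is not None:
        weight *= alpha if target == 1 else 1.0 - alpha
    return -weight * math.log(p_t)


# An easy pixel (true class predicted at 0.9) versus a hard one (0.1).
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
```

Thin objects such as light poles occupy few pixels, so down-weighting the many easy background pixels is a plausible reason for trying this loss, although the abstract does not state the motivation.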
Funding: Foundation of Anhui Province Key Laboratory of Physical Geographic Environment (No. 2022PGE012).
Abstract: Accurate boundaries of smallholder farm fields are important and indispensable geo-information that helps farmers, managers, and policymakers better manage and utilize agricultural resources. Because of their small size, irregular shape, and the use of mixed-cropping techniques, smallholder farm fields can be difficult to delineate automatically. In recent years, numerous studies on field contour extraction using deep Convolutional Neural Networks (CNNs) have been proposed. However, there is a relative shortage of labeled data for field boundaries, which affects the training of CNNs. Traditional methods mostly use image flipping and random rotation for data augmentation. In this paper, we propose to apply a Generative Adversarial Network (GAN) to the data augmentation of farm field labels to increase the diversity of samples. Specifically, we propose an automated method that combines Fully Convolutional Neural Networks (FCN) with GAN to improve the delineation accuracy of smallholder farms from Very High Resolution (VHR) images. We first investigate four State-Of-The-Art (SOTA) FCN architectures, i.e., U-Net, PSPNet, SegNet, and OCRNet, to find the optimal architecture for the contour detection task of smallholder farm fields. Second, we apply the identified optimal FCN architecture in combination with Contour GAN and pixel2pixel GAN to improve the accuracy of contour detection. We test our method on a study area in the Sudano-Sahelian savanna region of northern Nigeria. The best combination achieved F1 scores of 0.686 on Test Set 1 (TS1), 0.684 on Test Set 2 (TS2), and 0.691 on Test Set 3 (TS3). The results indicate that our architecture adapts to a variety of advanced networks and is effective in this task. The conceptual, theoretical, and experimental knowledge from this study is expected to seed many GAN-based farm delineation methods in the future.
Funding: Funded by the National Key Research and Development Program of China (No. 2017YFC1501200) and the National Natural Science Foundation of China (Nos. 51678536, 41404096); supported by the Department of Education's Production-Study-Research Combined Innovation Funding "Blue Fire Plan (Huizhou)" (CXZJHZ01742), the Program for Science and Technology Innovation Talents in Universities of Henan Province (Grant No. 19HASTIT043), and the Outstanding Young Talent Research Fund of Zhengzhou University (1621323001).
Abstract: Cracking is a common pavement failure problem. A lack of periodic maintenance allows cracks to extend and damage the pavement, which affects the normal use of the road. It is therefore important to establish an efficient, intelligent identification model for pavement cracks. A neural network is a method that simulates animal nervous systems and uses gradient descent to predict results by learning a weight matrix. It has been widely used in geotechnical engineering, computer vision, medicine, and other fields. However, there are three major problems in applying neural networks to crack identification: there are too few layers, the extracted crack features are incomplete, and the method is not efficient enough to process the whole picture. In this study, a fully convolutional neural network based on ResNet-101 is used to establish an intelligent identification model of pavement crack regions. By using a convolutional layer instead of a fully connected layer, this method realizes full convolution and accelerates calculation. The region proposals come from the feature map at the end of the base network, which avoids multiple computations on the same picture. Online hard example mining and data-augmentation techniques are adopted to improve the model's recognition accuracy. We trained and tested the model on Concrete Crack Images for Classification (CCIC), a public dataset collected using smartphones, and on the Crack Image Database (CIDB), which was automatically collected using vehicle-mounted charge-coupled device cameras, with identification accuracy reaching 91.4% and 86.4%, respectively. The proposed model has higher recognition accuracy and recall than Faster RCNN and models of different depths, and can extract more complete and accurate crack features in CIDB. We also analyzed translated, blurred, scaled, and distorted images. The proposed model shows strong robustness and stability, and can automatically identify image cracks of different forms. It has broad application prospects in practical engineering problems.
Abstract: As a chemical component, nicotine has an important influence on the quality of tobacco leaves. Rapid and nondestructive quantitative analysis of nicotine is an important task in the tobacco industry. Near-infrared (NIR) spectroscopy has been widely used as an effective technique for chemical composition analysis. In this paper, we propose a one-dimensional fully convolutional network (1D-FCN) model to quantitatively analyze the nicotine content of tobacco leaves using NIR spectroscopy data in a cloud environment. This 1D-FCN model uses one-dimensional convolution layers to directly extract complex features from sequential spectroscopy data. It consists of five convolutional layers and two fully connected layers, with the max-pooling layer replaced by a convolutional layer to avoid information loss. Cloud computing techniques are used to meet the increasing demand for large-scale data analysis and to implement data sharing and access. Experimental results show that the proposed 1D-FCN model can effectively extract the complex characteristics of the spectrum and predict the nicotine content of tobacco leaves more accurately than other approaches. This research provides a deep learning foundation for quantitative analysis of NIR spectral data in the tobacco industry.
Funding: The National Natural Science Foundation of China (No. 41274129); Chuan Qing Drilling Engineering Company's Scientific Research Project: Seismic detection technology and application of complex carbonate reservoir in Sulige Majiagou Formation; the 2018 Central Supporting Local Co-construction Fund (No. 80000-18Z0140504); and the Construction and Development of Universities in 2019-Joint Support for Geophysics (Double First-Class center, 80000-19Z0204).
Abstract: In this paper, the complete process of constructing a 3D digital core with a fully convolutional neural network is described in detail. A large number of sandstone computed tomography (CT) images are used as training input for a fully convolutional neural network model. This model is used to reconstruct the three-dimensional (3D) digital core of Berea sandstone from a small number of CT images. The Hamming distance, together with the Minkowski functions for porosity, average volume-specific surface area, average curvature, and connectivity of both the real core and the digital reconstruction, is used to evaluate the accuracy of the proposed method. The results show that the reconstruction achieved relative errors of 6.26%, 1.40%, 6.06%, and 4.91% for the four Minkowski functions and a Hamming distance of 0.04479. This demonstrates that the proposed method can not only reconstruct the physical properties of real sandstone but also restore the real characteristics of the pore distribution in sandstone, which offers a new way to characterize the internal microstructure of rocks.
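Two of the evaluation quantities above can be sketched for binary voxel volumes: the Hamming distance between the real and reconstructed cores, and the relative error of porosity. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes that the Hamming distance is normalized by the number of voxels and that pore voxels are labeled 1, neither of which the abstract states, and all function names and toy volumes are hypothetical.

```python
def flatten(volume):
    """Flatten a nested [z][y][x] list of voxel labels into one list."""
    return [voxel for plane in volume for row in plane for voxel in row]


def normalized_hamming_distance(volume_a, volume_b):
    """Fraction of voxels whose labels differ between two equal-sized volumes."""
    flat_a, flat_b = flatten(volume_a), flatten(volume_b)
    if len(flat_a) != len(flat_b):
        raise ValueError("volumes must contain the same number of voxels")
    differing = sum(a != b for a, b in zip(flat_a, flat_b))
    return differing / len(flat_a)


def porosity(volume, pore_label=1):
    """Fraction of voxels labeled as pore space (label 1 is an assumption)."""
    flat = flatten(volume)
    return sum(voxel == pore_label for voxel in flat) / len(flat)


def relative_error(reference, estimate):
    """Relative error of an estimate with respect to a nonzero reference."""
    return abs(estimate - reference) / abs(reference)


# Toy 2x2x2 volumes that differ in exactly one voxel.
real_core = [[[1, 0], [0, 0]], [[1, 1], [0, 1]]]
digital_core = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
distance = normalized_hamming_distance(real_core, digital_core)
porosity_error = relative_error(porosity(real_core), porosity(digital_core))
```

The Hamming distance compares the volumes voxel by voxel, while the porosity error compares a single summary statistic, so the two measures capture different aspects of reconstruction quality.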