Panoramic images are widely used in many scenes, especially in virtual reality and street view capture. However, they are new to street furniture identification, which is usually based on mobile laser scanning point cloud data or conventional 2D images. This study proposes performing semantic segmentation on panoramic images and transformed images to separate light poles and traffic signs from the background, implemented with pre-trained Fully Convolutional Networks (FCN). FCN is among the most important deep learning models for semantic segmentation because of its end-to-end training process and pixel-wise prediction. In this study, we use an FCN-8s model pre-trained on the Cityscapes dataset and fine-tune it with our own data. We then replace the cross-entropy loss function with the focal loss function in the FCN model and train it again to produce the predictions. The results show that in all three settings (the pre-trained model, the fine-tuned model, and the FCN model with focal loss), light poles and traffic signs are detected well, and the transformed images perform better than the panoramic images according to the Recall and IoU evaluation.
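The loss swap described in this abstract (cross-entropy replaced by focal loss) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the flattened `(num_pixels, num_classes)` layout, and the default `gamma=2.0` are assumptions made for the example.

```python
import numpy as np


def _true_class_prob(logits, labels, eps=1e-12):
    """Softmax probability assigned to the true class of each pixel."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    pt = probs[np.arange(labels.size), labels]
    return np.clip(pt, eps, 1.0)


def cross_entropy(logits, labels):
    """Mean cross-entropy over pixels; logits has shape (num_pixels, num_classes)."""
    pt = _true_class_prob(logits, labels)
    return float(-np.log(pt).mean())


def focal_loss(logits, labels, gamma=2.0):
    """Mean focal loss: cross-entropy scaled per pixel by (1 - p_t) ** gamma.

    gamma is an assumed default; gamma = 0 recovers plain cross-entropy.
    """
    pt = _true_class_prob(logits, labels)
    return float((-((1.0 - pt) ** gamma) * np.log(pt)).mean())
```

The modulating factor `(1 - p_t) ** gamma` shrinks the contribution of pixels that are already classified confidently, which is why focal loss is often tried when small objects such as poles and signs occupy few pixels.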
Accurate boundaries of smallholder farm fields are important and indispensable geo-information that helps farmers, managers, and policymakers better manage and utilize their agricultural resources. Because of their small size, irregular shape, and the use of mixed-cropping techniques, smallholder farm fields can be difficult to delineate automatically. In recent years, numerous studies on field contour extraction using deep Convolutional Neural Networks (CNN) have been proposed. However, there is a relative shortage of labeled data for field boundaries, which affects the training of CNNs. Traditional methods mostly use image flipping and random rotation for data augmentation. In this paper, we propose applying a Generative Adversarial Network (GAN) to augment farm field labels and increase the diversity of samples. Specifically, we propose an automated method that combines Fully Convolutional Neural networks (FCN) with GAN to improve the delineation accuracy of smallholder farms from Very High Resolution (VHR) images. We first investigate four State-Of-The-Art (SOTA) FCN architectures, i.e., U-Net, PSPNet, SegNet and OCRNet, to find the optimal architecture for the contour detection task on smallholder farm fields. Second, we apply the identified optimal FCN architecture in combination with Contour GAN and pixel2pixel GAN to improve the accuracy of contour detection. We test our method on a study area in the Sudano-Sahelian savanna region of northern Nigeria. The best combination achieved F1 scores of 0.686 on Test Set 1 (TS1), 0.684 on Test Set 2 (TS2), and 0.691 on Test Set 3 (TS3). Results indicate that our architecture adapts to a variety of advanced networks and is effective in this task. The conceptual, theoretical, and experimental knowledge from this study is expected to seed many GAN-based farm delineation methods in the future.
The separation of individual pigs from pigpen scenes is crucial for precision farming, and technology based on convolutional neural networks can provide a low-cost, non-contact, non-invasive method of pig image segmentation. However, two factors limit the development of this field. On the one hand, individual pigs tend to crowd together, and occlusion by structures such as pigpens can easily cause the model to misjudge. On the other hand, manual labeling of group-raised pig data is time-consuming, labor-intensive, and prone to labeling errors. There is therefore an urgent need for an individual pig image segmentation model that performs well in individual scenarios and can be easily migrated to a group-raised environment. To solve these problems, taking individual pigs as research objects, an individual pig image segmentation dataset containing 2066 images was constructed, and a series of algorithms based on fully convolutional networks were proposed to solve the pig image segmentation problem. To capture long-range dependencies and weaken background information such as pigpens while enhancing the information of individual parts of pigs, channel and spatial attention blocks were introduced into the best-performing decoders, UNet and LinkNet. Experiments show that, with ResNeXt50 as the encoder and UNet as the decoder as the basic model, adding both attention blocks at the same time achieves 98.30% and 96.71% on the F1 and IoU metrics, respectively. Compared with the model that adds the channel attention block alone, the two metrics are improved by 0.13% and 0.22%, respectively. The experiments that introduce channel and spatial attention separately show that spatial attention is more effective than channel attention. Taking VGG16-LinkNet as an example, compared with channel attention, spatial attention improves the F1 and IoU metrics by 0.16% and 0.30%, respectively. Furthermore, heatmaps of the features from different decoder layers after adding different attention information show that the boundary of the pig segmentation becomes clearer as the layers increase. To verify the effectiveness of the individual pig image segmentation model in group-raised scenes, the transfer performance of the model is evaluated in three scenarios: high separation, deep adhesion, and pigpen occlusion. The experiments show that the segmentation results obtained by adding attention information, especially the simultaneous fusion of channel and spatial attention blocks, are more refined and complete. The attention-based individual pig image segmentation model can be effectively transferred to group-raised pigs and can provide a reference for their pre-segmentation.
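The F1 and IoU metrics reported in this abstract can be computed from a predicted and a reference binary mask as in the sketch below. The function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
import numpy as np


def f1_and_iou(pred_mask, true_mask):
    """Return (F1, IoU) for two binary masks of identical shape."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    if pred.shape != true.shape:
        raise ValueError("masks must have the same shape")
    tp = np.logical_and(pred, true).sum()    # foreground predicted correctly
    fp = np.logical_and(pred, ~true).sum()   # background predicted as foreground
    fn = np.logical_and(~pred, true).sum()   # foreground that was missed
    if tp + fp + fn == 0:
        return 1.0, 1.0  # assumed convention: two empty masks agree perfectly
    f1 = 2.0 * tp / (2.0 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return float(f1), float(iou)
```

For a single mask pair the two metrics are linked by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU, which matches the ordering of the figures quoted in the abstract.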
Convolutional neural networks (CNNs) have proven to be effective clinical imaging methods. This study highlights some of the key issues within these systems. It is difficult to train these systems on limited clinical image databases, and many publications present strategies including such learning algorithms. Furthermore, these patterns are known for making a highly reliable prognosis. In addition, normalization of volume and Dice loss have been used effectively to accelerate and stabilize the training. However, these systems are improperly regulated, resulting in more confident ratings for both correct and incorrect classifications, which are inaccurate and difficult to understand. This study examines the risk assessment of Fully Convolutional Neural Networks (FCNNs) for clinical image segmentation. The essential contributions of this work are: 1) Dice loss and cross-entropy loss are compared on the basis of segment quality and uncertainty assessment of FCNNs; 2) a group model is proposed for assurance measurement of fully convolutional neural networks trained with Dice loss and group normalization; and 3) the ability of the measured FCNs to evaluate the segment quality of the structures and to identify test examples outside the distribution is assessed. To evaluate these contributions, a series of tests was conducted in three clinical image segmentation applications: heart, brain, and prostate. The findings of the study provide significant insights into predictive ambiguity assessment and practical strategies for outside-distribution identification and reliable measurement in clinical image segmentation. The approaches presented in this research significantly enhance the reliability and accuracy rating of CNN-based clinical imaging methods.
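The two losses compared in this abstract can be written down compactly for a binary segmentation map. The sketch below is a generic NumPy formulation, assuming per-pixel foreground probabilities and a smoothing constant `eps`; it is not taken from the study itself.

```python
import numpy as np


def soft_dice_loss(probs, target, eps=1e-6):
    """1 - soft Dice coefficient between predicted probabilities and a binary target."""
    probs = np.asarray(probs, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = (probs * target).sum()
    dice = (2.0 * intersection + eps) / (probs.sum() + target.sum() + eps)
    return float(1.0 - dice)


def binary_cross_entropy(probs, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    losses = -(target * np.log(probs) + (1.0 - target) * np.log(1.0 - probs))
    return float(losses.mean())
```

Dice loss is computed from region overlap and ignores correctly predicted background, whereas cross-entropy averages a per-pixel log-likelihood; that difference is one reason the two can yield differently calibrated confidence estimates.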
Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, in this paper we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend it to three dimensions using a three-dimensional (3D) CNN, which can harness each land cover category's time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves an mIoU of 60.366% on the WHDLD dataset and 75.127% on the GID dataset, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
We propose a multi-focus image fusion method in which a fully convolutional network for focus detection (FD-FCN) is constructed. To obtain more precise focus detection maps, we propose adding skip layers to the network so that both detailed and abstract visual information is available when FD-FCN generates the maps. A new training dataset for the proposed network is constructed based on the CIFAR-10 dataset. The image fusion algorithm using FD-FCN contains three steps: focus maps are obtained using FD-FCN, a decision map is generated by applying a morphological process to the focus maps, and the images are fused using the decision map. We carry out several sets of experiments, and both subjective and objective assessments demonstrate the superiority of the proposed fusion method over state-of-the-art algorithms.
As a chemical component, nicotine has an important influence on the quality of tobacco leaves. Rapid and nondestructive quantitative analysis of nicotine is an important task in the tobacco industry. Near-infrared (NIR) spectroscopy has been widely used as an effective technique for chemical composition analysis. In this paper, we propose a one-dimensional fully convolutional network (1D-FCN) model to quantitatively analyze the nicotine content of tobacco leaves using NIR spectroscopy data in a cloud environment. The 1D-FCN model uses one-dimensional convolution layers to extract complex features directly from sequential spectroscopy data. It consists of five convolutional layers and two fully connected layers, with the max-pooling layer replaced by a convolutional layer to avoid information loss. Cloud computing techniques are used to handle the increasing demand for large-scale data analysis and to implement data sharing and access. Experimental results show that the proposed 1D-FCN model can effectively extract the complex characteristics of the spectrum and predict the nicotine content of tobacco leaves more accurately than other approaches. This research provides a deep learning foundation for quantitative analysis of NIR spectral data in the tobacco industry.
Cracking is a common pavement failure problem. A lack of periodic maintenance allows cracks to extend and damage the pavement, which affects the normal use of the road. It is therefore important to establish an efficient intelligent identification model for pavement cracks. A neural network is a method that simulates animal nervous systems and uses gradient descent to predict results by learning a weight matrix. It has been widely used in geotechnical engineering, computer vision, medicine, and other fields. However, there are three major problems in applying neural networks to crack identification: there are too few layers, the extracted crack features are incomplete, and the method is not efficient enough to process the whole picture. In this study, a fully convolutional neural network based on ResNet-101 is used to establish an intelligent identification model of pavement crack regions. By using a convolutional layer instead of a fully connected layer, the method realizes full convolution and accelerates calculation. The region proposals come from the feature map at the end of the base network, which avoids multiple computations on the same picture. Online hard example mining and data-augmentation techniques are adopted to improve the model's recognition accuracy. We trained and tested the model on Concrete Crack Images for Classification (CCIC), a public dataset collected using smartphones, and on the Crack Image Database (CIDB), which was collected automatically using vehicle-mounted charge-coupled device cameras; the identification accuracy reached 91.4% and 86.4%, respectively. The proposed model has higher recognition accuracy and recall than Faster R-CNN and models of different depths, and it can extract more complete and accurate crack features in CIDB. We also analyzed translated, blurred, scaled, and distorted images. The proposed model shows strong robustness and stability and can automatically identify image cracks of different forms. It has broad application prospects in practical engineering problems.
In this paper, the complete process of constructing a 3D digital core with a fully convolutional neural network is described in detail. A large number of sandstone computed tomography (CT) images are used as training input for a fully convolutional neural network model. This model is used to reconstruct the three-dimensional (3D) digital core of Berea sandstone from a small number of CT images. The Hamming distance, together with the Minkowski functions for porosity, average volume specific surface area, average curvature, and connectivity of both the real core and the digital reconstruction, is used to evaluate the accuracy of the proposed method. The results show that the reconstruction achieved relative errors of 6.26%, 1.40%, 6.06%, and 4.91% for the four Minkowski functions and a Hamming distance of 0.04479. This demonstrates that the proposed method can not only reconstruct the physical properties of real sandstone but also restore the real characteristics of the pore distribution in sandstone, which offers a new way to characterize the internal microstructure of rocks.
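Two of the evaluation quantities named in this abstract are easy to illustrate on binary voxel volumes. The sketch below assumes that pore voxels are encoded as 1 and that the Hamming distance is normalized by the number of voxels (which is consistent with a reported value below 1); both are assumptions, as are the function names.

```python
import numpy as np


def hamming_distance(volume_a, volume_b):
    """Fraction of voxels whose binary label differs between two volumes."""
    a = np.asarray(volume_a).astype(bool)
    b = np.asarray(volume_b).astype(bool)
    if a.shape != b.shape:
        raise ValueError("volumes must have the same shape")
    return float(np.mean(a != b))


def porosity(volume):
    """Fraction of voxels labeled as pore (assumed encoding: pore = 1)."""
    return float(np.mean(np.asarray(volume).astype(bool)))


def relative_error(estimate, reference):
    """Relative error of a reconstructed quantity against the real core."""
    return abs(estimate - reference) / abs(reference)
```

A typical use would compare `porosity(reconstruction)` against `porosity(real_core)` with `relative_error`, and report `hamming_distance(reconstruction, real_core)` as a voxel-wise disagreement rate.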
Anchor-free object-detection methods have achieved significant advances in the field of computer vision, particularly in real-time inference. However, in remote sensing object detection, anchor-free methods often lack the capability to separate the foreground from the background. This paper proposes an anchor-free method named probability-enhanced anchor-free detector (ProEnDet) for remote sensing object detection. First, a weighted bidirectional feature pyramid is used for feature extraction. Second, we introduce probability enhancement to strengthen the classification of the object's foreground and background. The detector uses the logarithm likelihood as the final score to improve the classification of the foreground and background of the object. ProEnDet is verified using the DIOR and NWPU-VHR-10 datasets. The experiments achieved mean average precisions of 61.4 and 69.0 on the DIOR and NWPU-VHR-10 datasets, respectively. ProEnDet achieves a speed of 32.4 FPS on the DIOR dataset, which satisfies the real-time requirements for remote-sensing object detection.
Accurate perception of urban boundary changes is conducive to the formulation of regional development planning and to research on sustainable urban development. In this paper, an improved fully convolutional neural network is proposed for perceiving large-scale urban change. The network structure is modified and the network strategy is updated to extract richer feature information and to meet the requirements of urban construction land extraction from large-scale, low-resolution images. The Yangtze River Economic Belt of China is taken as an empirical case to verify the practicability of the network. The results show that the extraction results of the improved fully convolutional neural network model reached a kappa coefficient of 0.88, which is better than that of traditional fully convolutional neural networks, and that the model performs well in construction land extraction at the scale of small and medium-sized cities.
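The kappa coefficient quoted in this abstract measures agreement between predicted and reference labels beyond what chance would produce. A minimal sketch of Cohen's kappa from a confusion matrix follows; the function name and the row/column convention (rows are reference classes, columns are predicted classes) are assumptions for illustration.

```python
import numpy as np


def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix of counts."""
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    observed = np.trace(cm) / total  # observed agreement
    # Agreement expected by chance from the row and column marginals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2
    return float((observed - expected) / (1.0 - expected))
```

For a two-class problem such as construction land versus everything else, a value of 1 means perfect agreement and 0 means agreement no better than chance.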
Numerical weather prediction of wind speed requires statistical postprocessing of systematic errors to obtain reliable and accurate forecasts. However, the use of postprocessing models is often undesirable for extreme weather events such as gales. Here, we propose a postprocessing algorithm based on a gale-aware deep attention network to simultaneously improve wind speed forecasts and gale area warnings. Specifically, the algorithm includes both a gale-aware loss function that focuses the model on potential gale areas and an observation station supervision strategy that alleviates the problem of missing extreme values caused by data gridding. The effectiveness of the proposed model was verified using data from 235 wind speed observation stations. Experimental results show that our model can produce wind speed forecasts with a root-mean-square error of 1.1547 m/s and a Hanssen–Kuipers discriminant score of 0.517, a performance superior to that of the other postprocessing algorithms considered.
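The two verification scores in this abstract can be sketched as below. The Hanssen–Kuipers discriminant is the probability of detection minus the probability of false detection for a yes/no event; here the event is "wind speed at or above a gale threshold". The default threshold of 17.2 m/s is an assumption for the example (the abstract does not state the threshold used), as are the function names.

```python
import numpy as np


def rmse(forecast, observed):
    """Root-mean-square error between forecast and observed wind speeds."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((f - o) ** 2)))


def hanssen_kuipers(forecast, observed, threshold=17.2):
    """Hanssen-Kuipers discriminant for the event 'speed >= threshold'.

    threshold is an assumed gale threshold in m/s, not a value from the abstract.
    """
    f = np.asarray(forecast, dtype=float) >= threshold
    o = np.asarray(observed, dtype=float) >= threshold
    hits = np.sum(f & o)
    misses = np.sum(~f & o)
    false_alarms = np.sum(f & ~o)
    correct_negatives = np.sum(~f & ~o)
    if hits + misses == 0 or false_alarms + correct_negatives == 0:
        raise ValueError("score is undefined without both events and non-events")
    pod = hits / (hits + misses)                             # probability of detection
    pofd = false_alarms / (false_alarms + correct_negatives)  # probability of false detection
    return float(pod - pofd)
```

A score of 1 corresponds to perfect gale warnings and 0 to no skill, so the two metrics together cover overall accuracy and the handling of the rare extreme events.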
Growing demand for seafood and reduced fishery harvests have driven intensive marine aquaculture in coastal regions, which may cause severe coastal water problems without adequate environmental management. Effective mapping of mariculture areas is essential for the protection of coastal environments. However, because of the limited spatial coverage and complex structures, it is still challenging for traditional methods to accurately extract mariculture areas from medium spatial resolution (MSR) images. To solve this problem, we propose using the full resolution cascade convolutional neural network (FRCNet), which maintains effective features over the whole training process, to identify mariculture areas from MSR images. Specifically, FRCNet uses a sequential full resolution neural network as the first-level subnetwork and gradually aggregates higher-level subnetworks in a cascade. Meanwhile, we apply a repeated fusion strategy so that features can receive information from different subnetworks simultaneously, leading to rich and representative features. As a result, FRCNet can effectively recognize different kinds of mariculture areas from MSR images. Results show that FRCNet performs better than other classical and recently proposed methods. The developed methods can provide valuable datasets for large-scale and intelligent modeling of marine aquaculture management and coastal zone planning.
With the rapid progress of deep convolutional neural networks, several applications of crowd counting have been proposed and explored in the literature. In congested scene monitoring, a variety of crowd density estimation approaches have been developed. Understanding highly congested scenes for crowd counting during the Muslim gatherings of Hajj and Umrah is a challenging task: a large number of individuals stand close together, and it is hard for detection techniques to recognize them because the crowd can vary from low density to high density. To deal with such highly congested scenes, we propose the Congested Scene Crowd Counting Network (CSCC-Net), which uses the first ten layers of VGG-16 as its core network because of its strong and robust transfer learning capability. A dilated convolutional neural network is used at the back end to widen the receptive field and extract a large range of information from the image without losing its original resolution. The dilated convolutional neural network is chosen mainly because it expands the kernel size without changing other parameters. Moreover, several loss functions have been applied to strengthen the evaluation accuracy of the model. Finally, the experiments have been evaluated on prominent datasets, namely ShanghaiTech Parts A and B, UCF_CC_50, and UCF_QNRF. Our model achieved remarkable results: 68.0 and 9.0 MAE on ShanghaiTech Parts A and B, 199.1 MAE on UCF_CC_50, and 99.8 on UCF_QNRF, respectively.
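The statement that dilation expands the kernel without changing other parameters can be made concrete. The sketch below uses the standard relation k_eff = k + (k - 1)(d - 1) and a naive one-dimensional dilated convolution with zero padding; it is a generic illustration with hypothetical function names, not the CSCC-Net back end.

```python
import numpy as np


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel; the number of weights stays kernel_size."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of stacked stride-1 layers given (kernel_size, dilation) pairs."""
    field = 1
    for kernel_size, dilation in layers:
        field += (kernel_size - 1) * dilation
    return field


def dilated_conv1d_same(signal, weights, dilation):
    """Naive 1D dilated cross-correlation with zero padding (odd kernel sizes only).

    The output has the same length as the input, i.e. resolution is preserved.
    """
    x = np.asarray(signal, dtype=float)
    w = np.asarray(weights, dtype=float)
    k = w.size
    if k % 2 == 0:
        raise ValueError("this sketch supports odd kernel sizes only")
    pad = dilation * (k - 1) // 2
    padded = np.pad(x, pad)
    out = np.zeros_like(x)
    for i in range(x.size):
        for j in range(k):
            out[i] += w[j] * padded[i + j * dilation]
    return out
```

With a 3-tap kernel and dilation 2, each output sees a span of 5 input samples while still using only 3 weights, and the output length equals the input length.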
In the daily application of an iris-recognition-at-a-distance (IAAD) system, many ocular images of low quality are acquired. Because the iris part of these images often does not meet the recognition requirements, the more accessible periocular regions are a good complement for recognition. To further boost the performance of IAAD systems, a novel end-to-end framework for multi-modal ocular recognition is proposed. The proposed framework mainly consists of iris/periocular feature extraction and matching, unsupervised iris quality assessment, and a score-level adaptive weighted fusion strategy. First, ocular feature reconstruction (OFR) is proposed to sparsely reconstruct each probe image from high-quality gallery images based on proper feature maps. Next, a new unsupervised iris quality assessment method based on random multiscale embedding robustness is proposed. Unlike existing iris quality assessment methods, it measures the quality of an iris image by its robustness in the embedding space. Finally, the fusion strategy uses the iris quality score as the fusion weight to combine the complementary information from the iris and periocular regions. Extensive experimental results on ocular datasets show that the proposed method clearly outperforms unimodal biometrics and that the fusion strategy can significantly improve the recognition performance.
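The abstract states only that the iris quality score serves as the fusion weight; the exact fusion rule is not given. One plausible reading is a convex combination of the two matching scores, sketched below. The formula, the function name, and the assumption that the quality score lies in [0, 1] are all hypothetical.

```python
def fuse_scores(iris_score, periocular_score, iris_quality):
    """Score-level fusion weighted by iris quality (assumed convex combination).

    iris_quality is clamped to [0, 1]; a low-quality iris shifts the decision
    toward the periocular matching score.
    """
    weight = min(max(float(iris_quality), 0.0), 1.0)
    return weight * iris_score + (1.0 - weight) * periocular_score
```

Under this reading, a sharp iris image lets the iris matcher dominate, while a blurred or occluded iris falls back on the periocular region, which is the complementary behavior the abstract describes.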
Landslide detection is a hot topic in the remote sensing community, particularly with the current rapid growth in the volume (and variety) of Earth observation data and the substantial progress of computer vision. Deep learning algorithms, especially fully convolutional networks (FCNs) and variations such as the ResU-Net, have been used recently as rapid and automatic landslide detection approaches. Although FCNs have shown cutting-edge results in automatic landslide detection, accuracy can be improved by adding prior knowledge through possible frameworks. This study evaluates a rule-based object-based image analysis (OBIA) approach built on probabilities resulting from the ResU-Net model for landslide detection. We train the ResU-Net model using a landslide dataset comprising landslide inventories from various geographic regions, including our study area, and test it on an area not used for training. In the OBIA stage, we first calculate land cover and image difference indices for pre- and post-landslide multi-temporal images. Next, we use the generated indices and the resulting ResU-Net probabilities for image segmentation; the extracted landslide object candidates are then optimized using rule-based classification. In the result validation section, the landslide detection of the proposed integration of the ResU-Net with a rule-based OBIA classification is compared with that of the ResU-Net alone. Our proposed approach improves the mean intersection-over-union of the resulting map from the ResU-Net by more than 22%.
Determining the navigation line is critical for the automatic navigation of agricultural robots in farmland. In this research, taking a wheat field as the typical scenario, a novel navigation line extraction algorithm based on semantic segmentation is proposed. Data containing horizontal parallax, height, and grayscale information (HHG) are constructed by combining re-encoded depth data and red-green-blue (RGB) data. The HHG, RGB, and depth data are used to achieve scene recognition and navigation line extraction for a wheat field. The method includes two main steps. First, semantic segmentation of the wheat, ground, and background is performed using a fully convolutional network (FCN). Second, the navigation line is fitted in the camera coordinate system on the basis of the semantic segmentation result and the principle of pinhole camera imaging. Our segmentation model is trained using 508 randomly selected images from a dataset and tested on 199 images. When labelled data are used as the reference benchmark, the mean intersection over union (mIoU) of the HHG data is greater than 95%, which is the highest among the three types of data. The semantic segmentation methods based on the RGB and HHG data show higher navigation line extraction accuracy rates (with the absolute value of the angle deviation less than 5°) than the compared methods. The mean and standard deviation of the angle deviation of the two methods are within 0.1° and 2.0°, while the mean and standard deviation of the distance deviation are less than 30 mm and 60 mm, respectively. These values meet the basic requirements of agricultural machinery field navigation. The novelty of this work is the proposal of a navigation line extraction algorithm based on semantic segmentation in wheat fields. The method is highly accurate and robust to interference from crop occlusion.
Funding: Foundation of Anhui Province Key Laboratory of Physical Geographic Environment (No. 2022PGE012)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 31671571), the Shanxi Province Basic Research Program Project (Free Exploration) (No. 20210302124523, 20210302123408, 202103021224149, and 202103021223141), and the Youth Agricultural Science and Technology Innovation Fund of Shanxi Agricultural University (Grant No. 2019027).
Abstract: Separating individual pigs from pigpen scenes is crucial for precision farming, and technology based on convolutional neural networks can provide a low-cost, non-contact, non-invasive method of pig image segmentation. However, two factors limit the development of this field. On the one hand, individual pigs tend to stick together, and occlusion by debris such as pigpens can easily cause the model to misjudge. On the other hand, manual labeling of group-raised pig data is time-consuming, labor-intensive, and prone to labeling errors. Therefore, there is an urgent need for an individual pig image segmentation model that performs well in individual scenarios and can be easily migrated to a group-raised environment. To solve these problems, taking individual pigs as research objects, an individual pig image segmentation dataset containing 2066 images was constructed, and a series of algorithms based on fully convolutional networks were proposed to solve the pig image segmentation problem. To capture long-range dependencies and weaken background information such as pigpens while enhancing the information of individual parts of pigs, channel and spatial attention blocks were introduced into the best-performing decoders, UNet and LinkNet. Experiments show that, with ResNext50 as the encoder and UNet as the decoder as the basic model, adding the two attention blocks at the same time achieves 98.30% and 96.71% on the F1 and IoU metrics, respectively. Compared with the model that adds the channel attention block alone, the two metrics are improved by 0.13% and 0.22%, respectively. The experiment that introduces channel and spatial attention separately shows that spatial attention is more effective than channel attention. Taking VGG16-LinkNet as an example, compared with channel attention, spatial attention improves the F1 and IoU metrics by 0.16% and 0.30%, respectively. Furthermore, heatmaps of the features of different decoder layers after adding different attention information show that, as the number of layers increases, the boundary of the pig image segmentation becomes clearer. To verify the effectiveness of the individual pig image segmentation model in group-raised scenes, the transfer performance of the model is verified in three scenarios: high separation, deep adhesion, and pigpen occlusion. The experiments show that the segmentation results obtained by adding attention information, especially the simultaneous fusion of channel and spatial attention blocks, are more refined and complete. The attention-based individual pig image segmentation model can be effectively transferred to group-raised pigs and can provide a reference for their pre-segmentation.
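As an illustration of the F1 and IoU metrics reported in the abstract above, the following minimal sketch computes both from two binary masks. The function name `f1_and_iou` and the flat 0/1 list representation are assumptions made for this example; they do not come from the paper, whose evaluation code is not shown here.

```python
def f1_and_iou(pred, truth):
    """Return (F1, IoU) for two equally sized binary masks (flat sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Count true positives, false positives, and false negatives pixel by pixel.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    f1_den = 2 * tp + fp + fn
    iou_den = tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_den if f1_den else 1.0
    iou = tp / iou_den if iou_den else 1.0
    return f1, iou


if __name__ == "__main__":
    print(f1_and_iou([1, 1, 0, 0], [1, 0, 1, 0]))
```

Because F1 = 2TP / (2TP + FP + FN) and IoU = TP / (TP + FP + FN), F1 is never smaller than IoU for the same prediction, which is consistent with the 98.30% versus 96.71% figures quoted above.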
Abstract: Convolutional neural networks (CNNs) have proven to be effective in clinical imaging methods. This study highlights some of the key issues with these systems. It is difficult to train these systems on limited clinical image databases, and many publications present strategies for such learning. Furthermore, these models are known for making highly reliable predictions. In addition, normalization of volume and Dice loss have been used effectively to accelerate and stabilize training. However, these systems are poorly calibrated, giving overly confident ratings for both correct and incorrect classifications, which makes them unreliable and difficult to interpret. This study examines uncertainty assessment of Fully Convolutional Neural Networks (FCNNs) for clinical image segmentation. The work makes three main contributions: 1) Dice loss and cross-entropy loss are compared in terms of segmentation quality and uncertainty assessment of FCNNs; 2) a group model is proposed for confidence measurement of FCNNs trained with Dice loss and group normalization; and 3) the ability of the measured FCNNs to assess the segmentation quality of structures and to identify out-of-distribution test examples is evaluated. To evaluate these contributions, a series of tests was conducted in three clinical image segmentation applications: heart, brain, and prostate. The findings provide significant insights into predictive uncertainty assessment, together with practical strategies for out-of-distribution identification and reliable measurement in clinical image segmentation. The approaches presented in this research significantly enhance the reliability and accuracy rating of CNN-based clinical imaging methods.
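To make the comparison between Dice loss and cross-entropy loss in the abstract above more concrete, here is a small sketch of both losses for a binary segmentation. The soft Dice formulation, the smoothing constant `eps`, and the function names are assumptions chosen for illustration; the abstract does not specify which variants the study used.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss: 1 - 2*sum(p*g) / (sum(p) + sum(g)), smoothed by eps."""
    inter = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    losses = []
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.2, 0.1]
    targets = [1, 1, 0, 0]
    print(soft_dice_loss(probs, targets), binary_cross_entropy(probs, targets))
```

Dice loss scores the overlap of the whole predicted region, whereas cross-entropy scores each pixel's probability independently, which is one reason the two can differ in how well the predicted probabilities reflect actual confidence.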
Funding: Supported by the National Natural Science Foundation of China [grant number 41671452].
Abstract: Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, in this paper we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend the MSFCN to three dimensions using a three-dimensional (3D) CNN, which is capable of harnessing each land cover category's time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves 60.366% on the WHDLD dataset and 75.127% on the GID dataset in terms of the mIoU index, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Funding: Project supported by the National Natural Science Foundation of China (No. 61801190), the Natural Science Foundation of Jilin Province, China (No. 20180101055JC), and the Outstanding Young Talent Foundation of Jilin Province, China (No. 20180520029JH).
Abstract: We propose a multi-focus image fusion method in which a fully convolutional network for focus detection (FD-FCN) is constructed. To obtain more precise focus detection maps, we add skip layers to the network so that both detailed and abstract visual information are available when FD-FCN generates the maps. A new training dataset for the proposed network is constructed from the CIFAR-10 dataset. The image fusion algorithm using FD-FCN contains three steps: focus maps are obtained using FD-FCN, a decision map is generated by applying a morphological process to the focus maps, and the images are fused using the decision map. We carry out several sets of experiments, and both subjective and objective assessments demonstrate the superiority of the proposed fusion method over state-of-the-art algorithms.
Abstract: As a chemical component, nicotine has an important influence on the quality of tobacco leaves. Rapid and nondestructive quantitative analysis of nicotine is an important task in the tobacco industry. Near-infrared (NIR) spectroscopy has been widely used as an effective technique for chemical composition analysis. In this paper, we propose a one-dimensional fully convolutional network (1D-FCN) model to quantitatively analyze the nicotine composition of tobacco leaves using NIR spectroscopy data in a cloud environment. The 1D-FCN model uses one-dimensional convolution layers to extract complex features directly from sequential spectroscopy data. It consists of five convolutional layers and two fully connected layers, with the max-pooling layer replaced by a convolutional layer to avoid information loss. Cloud computing techniques are used to handle the increasing demand for large-scale data analysis and to implement data sharing and access. Experimental results show that the proposed 1D-FCN model can effectively extract the complex characteristics of the spectrum and predict the nicotine content of tobacco leaves more accurately than other approaches. This research provides a deep learning foundation for quantitative analysis of NIR spectral data in the tobacco industry.
Funding: Funded by the National Key Research and Development Program of China (No. 2017YFC1501200) and the National Natural Science Foundation of China (Nos. 51678536, 41404096); supported by the Department of Education's Production-Study-Research Combined Innovation Funding, "Blue Fire Plan (Huizhou)" (CXZJHZ01742), the Program for Science and Technology Innovation Talents in Universities of Henan Province (Grant No. 19HASTIT043), and the Outstanding Young Talent Research Fund of Zhengzhou University (1621323001).
Abstract: Cracking is a common pavement failure problem. A lack of periodic maintenance allows cracks to extend and damage the pavement, which affects the normal use of the road. Therefore, it is important to establish an efficient, intelligent identification model for pavement cracks. The neural network is a method that simulates animal nervous systems, using gradient descent to learn a weight matrix with which results are predicted. It has been widely used in geotechnical engineering, computer vision, medicine, and other fields. However, there are three major problems in applying neural networks to crack identification: there are too few layers, the extracted crack features are incomplete, and the method lacks the efficiency to process the whole picture. In this study, a fully convolutional neural network based on ResNet-101 is used to establish an intelligent identification model of pavement crack regions. By using a convolutional layer instead of a fully connected layer, this method realizes full convolution and accelerates calculation. The region proposals come from the feature map at the end of the base network, which avoids multiple computations over the same picture. Online hard example mining and data-augmentation techniques are adopted to improve the model's recognition accuracy. We trained and tested on Concrete Crack Images for Classification (CCIC), a public dataset collected using smartphones, and the Crack Image Database (CIDB), which was automatically collected using vehicle-mounted charge-coupled device cameras, with identification accuracy reaching 91.4% and 86.4%, respectively. The proposed model has higher recognition accuracy and recall than Faster RCNN and models of different depths, and can extract more complete and accurate crack features in CIDB. We also analyzed translated, blurred, scaled, and distorted images. The proposed model shows strong robustness and stability, and can automatically identify image cracks of different forms. It has broad application prospects in practical engineering problems.
Funding: Supported by the National Natural Science Foundation of China (No. 41274129); the Chuan Qing Drilling Engineering Company's Scientific Research Project "Seismic detection technology and application of complex carbonate reservoir in Sulige Majiagou Formation"; the 2018 Central Supporting Local Co-construction Fund (No. 80000-18Z0140504); and the Construction and Development of Universities in 2019, Joint Support for Geophysics (Double First-Class center, 80000-19Z0204).
Abstract: In this paper, the complete process of constructing a 3D digital core with a fully convolutional neural network is described in detail. A large number of sandstone computed tomography (CT) images are used as training input for a fully convolutional neural network model. This model is used to reconstruct the three-dimensional (3D) digital core of Berea sandstone based on a small number of CT images. The Hamming distance together with the Minkowski functions for porosity, average volume specific surface area, average curvature, and connectivity of both the real core and the digital reconstruction are used to evaluate the accuracy of the proposed method. The results show that the reconstruction achieved relative errors of 6.26%, 1.40%, 6.06%, and 4.91% for the four Minkowski functions and a Hamming distance of 0.04479. This demonstrates that the proposed method can not only reconstruct the physical properties of real sandstone but also restore the real characteristics of the pore distribution in sandstone, which offers a new way to characterize the internal microstructure of rocks.
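The evaluation described in the abstract above can be sketched for binarized voxel data as follows. This is only an illustration under stated assumptions: the Hamming distance is taken to be normalized by the number of voxels (which would match a value such as 0.04479), pores are assumed to be encoded as 0, and the names `normalized_hamming`, `porosity`, and `relative_error` are hypothetical. Only the porosity functional is shown; the other three Minkowski functions are not reproduced.

```python
def normalized_hamming(a, b):
    """Fraction of voxels whose labels differ between two equally sized volumes."""
    if len(a) != len(b):
        raise ValueError("volumes must contain the same number of voxels")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def porosity(voxels, pore_value=0):
    """Pore volume fraction, assuming pore voxels are encoded as pore_value."""
    return sum(1 for v in voxels if v == pore_value) / len(voxels)


def relative_error(reference, estimate):
    """Relative error of an estimate with respect to a non-zero reference value."""
    return abs(estimate - reference) / abs(reference)


if __name__ == "__main__":
    real = [0, 0, 1, 1, 1, 1, 0, 1]    # flattened binary volume of the real core
    recon = [0, 1, 1, 1, 1, 1, 0, 1]   # flattened binary volume of the reconstruction
    print(normalized_hamming(real, recon))
    print(relative_error(porosity(real), porosity(recon)))
```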
Funding: Supported in part by the National Natural Science Foundation of China (42001408).
Abstract: Anchor-free object detection methods have achieved significant advances in the field of computer vision, particularly in real-time inference. However, in remote sensing object detection, anchor-free methods often lack the capability to separate the foreground from the background. This paper proposes an anchor-free method named probability-enhanced anchor-free detector (ProEnDet) for remote sensing object detection. First, a weighted bidirectional feature pyramid is used for feature extraction. Second, we introduce probability enhancement to strengthen the classification of the object's foreground and background. The detector uses the log-likelihood as the final score to improve the classification of the foreground and background of the object. ProEnDet is verified using the DIOR and NWPU-VHR-10 datasets. The experiments achieved mean average precisions of 61.4 and 69.0 on the DIOR and NWPU-VHR-10 datasets, respectively. ProEnDet achieves a speed of 32.4 FPS on the DIOR dataset, which satisfies the real-time requirements for remote sensing object detection.
Funding: Supported by the Natural Science Foundation of Chongqing, China (No. cstc2020jcyj-jqX0004), the Ministry of Education Humanities and Social Science Project (No. 20YJA790016), and the National Natural Science Foundation of China (Grant No. 42171298). We thank the patent supporting the method section of the paper (No. 202110750360.1).
Abstract: Accurately perceiving urban boundary changes is conducive to formulating regional development plans and to research on sustainable urban development. In this paper, an improved fully convolutional neural network is presented for perceiving large-scale urban change; the network structure is modified and the network strategy updated to extract richer feature information and to meet the requirements of urban construction land extraction from large-scale, low-resolution imagery. The Yangtze River Economic Belt of China is taken as an empirical object to verify the practicability of the network. The results show that the extraction results of the improved fully convolutional neural network model reached a kappa coefficient of 0.88, which is better than that of traditional fully convolutional neural networks, and that the model performs well in construction land extraction at the scale of small and medium-sized cities.
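For readers unfamiliar with the kappa coefficient quoted in the abstract above, the sketch below computes Cohen's kappa from a confusion matrix. The function name and the row/column convention (rows as reference classes, columns as predicted classes) are assumptions for this example, and the sample matrix is made up rather than taken from the paper.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    p_observed = sum(confusion[i][i] for i in range(k)) / n
    # Expected agreement by chance, from the row and column marginals.
    p_expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical counts: construction land vs. other land.
    print(cohens_kappa([[45, 5], [5, 45]]))
```

Kappa discounts the agreement that would be expected by chance, so it is a stricter measure than overall accuracy when one class, such as non-construction land, dominates the image.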
Funding: Supported by the National Natural Science Foundation of China (62106169).
Abstract: Numerical weather prediction of wind speed requires statistical postprocessing of systematic errors to obtain reliable and accurate forecasts. However, the use of postprocessing models is often undesirable for extreme weather events such as gales. Here, we propose a postprocessing algorithm based on a gale-aware deep attention network to simultaneously improve wind speed forecasts and gale area warnings. Specifically, the algorithm includes both a gale-aware loss function that focuses the model on potential gale areas and an observation station supervision strategy that alleviates the problem of missing extreme values caused by data gridding. The effectiveness of the proposed model was verified using data from 235 wind speed observation stations. Experimental results show that our model can produce wind speed forecasts with a root-mean-square error of 1.1547 m s⁻¹ and a Hanssen–Kuipers discriminant score of 0.517, a performance superior to that of the other postprocessing algorithms considered.
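The two verification scores named in the abstract above can be sketched as follows. The gale threshold of 17.2 m/s (the lower bound of Beaufort force 8) is an assumption, since the abstract does not state which threshold the authors used, and the function names and sample values are likewise hypothetical.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between forecast and observed wind speeds."""
    return math.sqrt(
        sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)
    )


def hanssen_kuipers(forecast, observed, threshold=17.2):
    """Hanssen-Kuipers score: probability of detection minus false detection rate."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    if hits + misses == 0 or false_alarms + correct_negatives == 0:
        raise ValueError("need at least one observed event and one non-event")
    pod = hits / (hits + misses)
    pofd = false_alarms / (false_alarms + correct_negatives)
    return pod - pofd


if __name__ == "__main__":
    forecast = [18.0, 16.0, 20.0, 10.0, 18.0]   # made-up values in m/s
    observed = [19.0, 18.0, 21.0, 9.0, 12.0]
    print(rmse(forecast, observed), hanssen_kuipers(forecast, observed))
```

RMSE measures the overall forecast error, while the Hanssen-Kuipers score only looks at whether the gale event itself was correctly warned, which is why the abstract reports both.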
Funding: Supported by the National Natural Science Foundation of China [grant numbers 42101404, 42107498] and the National Key Research and Development Program of China [grant number 2020YFC1807501].
Abstract: Growing demand for seafood and reduced fishery harvests have driven intensive marine aquaculture farming in coastal regions, which may cause severe coastal water problems without adequate environmental management. Effective mapping of mariculture areas is essential for the protection of coastal environments. However, because of their limited spatial coverage and complex structures, it is still challenging for traditional methods to accurately extract mariculture areas from medium spatial resolution (MSR) images. To solve this problem, we propose to use the full resolution cascade convolutional neural network (FRCNet), which maintains effective features over the whole training process, to identify mariculture areas from MSR images. Specifically, FRCNet uses a sequential full resolution neural network as the first-level subnetwork and gradually aggregates higher-level subnetworks in a cascade. Meanwhile, we apply a repeated fusion strategy so that features can receive information from different subnetworks simultaneously, leading to rich and representative features. As a result, FRCNet can effectively recognize different kinds of mariculture areas in MSR images. Results show that FRCNet performs better than other classical and recently proposed methods. The developed methods can provide valuable datasets for large-scale, intelligent modeling of marine aquaculture management and coastal zone planning.
Funding: This research is supported by the Ministry of Education, Saudi Arabia, under Project Number QURDO001.
Abstract: With the rapid progress of deep convolutional neural networks, several applications of crowd counting have been proposed and explored in the literature. For congested scene monitoring, a variety of crowd density estimation approaches have been developed. Understanding highly congested scenes for crowd counting during the Muslim gatherings of Hajj and Umrah is a challenging task, because a large number of individuals stand close together and are hard for detection techniques to recognize, and because the crowd can vary from low density to high density. To deal with such highly congested scenes, we propose the Congested Scene Crowd Counting Network (CSCC-Net), which uses the first ten layers of VGG-16 as the core network because of its strong and robust transfer learning capability. A dilated (hole) convolutional neural network is used at the back end to widen the receptive field so that a large range of information can be extracted from the image without losing its original resolution. The dilated convolutional neural network is chosen mainly to expand the kernel size without changing other parameters. Moreover, several loss functions are applied to strengthen the evaluation accuracy of the model. Finally, all experiments are evaluated on prominent datasets, namely ShanghaiTech Parts A and B, UCF_CC_50, and UCF_QNRF. Our model achieves remarkable results: 68.0 and 9.0 MAE on ShanghaiTech Parts A and B, respectively, 199.1 MAE on UCF_CC_50, and 99.8 on UCF_QNRF.
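The remark in the abstract above that dilated convolution expands the kernel size without changing other parameters can be illustrated in one dimension. The sketch below is not CSCC-Net; it is a plain-Python toy with hypothetical function names, showing that a kernel with three weights and dilation 2 covers five input positions while still using only three weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D dilated cross-correlation of signal with kernel."""
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are placed `dilation` samples apart; the weights are unchanged.
        output.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return output


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6]
    print(effective_kernel_size(3, 2))                   # span of a 3-tap kernel, dilation 2
    print(dilated_conv1d(signal, [1, 1, 1], dilation=1))
    print(dilated_conv1d(signal, [1, 1, 1], dilation=2))
```

In this toy the number of weights stays at three for both dilation settings; only the spacing between the sampled inputs changes, which is how the receptive field grows without extra parameters or pooling.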
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 62006225, 61906199 and 62071468) and the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS), China (No. XDA 27040700), and sponsored by the Beijing Nova Program, China (Nos. Z201100006820050 and Z211100002121010).
Abstract: In the daily application of an iris-recognition-at-a-distance (IAAD) system, many ocular images of low quality are acquired. As the iris part of these images often does not meet the recognition requirements, the more accessible periocular regions are a good complement for recognition. To further boost the performance of IAAD systems, a novel end-to-end framework for multi-modal ocular recognition is proposed. The proposed framework mainly consists of iris/periocular feature extraction and matching, unsupervised iris quality assessment, and a score-level adaptive weighted fusion strategy. First, ocular feature reconstruction (OFR) is proposed to sparsely reconstruct each probe image from high-quality gallery images based on proper feature maps. Next, a new unsupervised iris quality assessment method based on random multiscale embedding robustness is proposed. Unlike existing iris quality assessment methods, the quality of an iris image is measured by its robustness in the embedding space. Finally, the fusion strategy exploits the iris quality score as the fusion weight to combine the complementary information from the iris and periocular regions. Extensive experimental results on ocular datasets show that the proposed method is clearly better than unimodal biometrics and that the fusion strategy can significantly improve recognition performance.
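A score-level weighted fusion of the kind mentioned in the abstract above might look like the sketch below. The convex-combination form, the assumption that the quality score lies in [0, 1], and the function name `fuse_scores` are all illustrative choices; the abstract does not give the exact fusion rule used in the paper.

```python
def fuse_scores(iris_score, periocular_score, iris_quality):
    """Blend two matching scores, using the iris quality score as the weight.

    Assumes iris_quality is meant to lie in [0, 1]; values outside are clipped.
    """
    weight = min(max(iris_quality, 0.0), 1.0)
    return weight * iris_score + (1.0 - weight) * periocular_score


if __name__ == "__main__":
    # A low-quality iris image: the periocular score dominates the fused score.
    print(fuse_scores(iris_score=0.9, periocular_score=0.5, iris_quality=0.25))
```

Under this assumed rule, a poor iris image contributes little to the final score and the periocular match takes over, which matches the motivation given in the abstract.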
Funding: Funded by the Institute of Advanced Research in Artificial Intelligence (IARAI) GmbH, Landstraßer Hauptstraße 5, 1030 Vienna, Austria [VAT number (UID): ATU74131236].
Abstract: Landslide detection is a hot topic in the remote sensing community, particularly with the current rapid growth in the volume (and variety) of Earth observation data and the substantial progress of computer vision. Deep learning algorithms, especially fully convolutional networks (FCNs) and variations such as the ResU-Net, have recently been used as rapid and automatic landslide detection approaches. Although FCNs have shown cutting-edge results in automatic landslide detection, accuracy can be improved by adding prior knowledge through suitable frameworks. This study evaluates a rule-based object-based image analysis (OBIA) approach built on probabilities resulting from the ResU-Net model for landslide detection. We train the ResU-Net model using a landslide dataset comprising landslide inventories from various geographic regions, including our study area, and test it on an area not used for training. In the OBIA stage, we first calculate land cover and image difference indices for pre- and post-landslide multi-temporal images. Next, we use the generated indices and the resulting ResU-Net probabilities for image segmentation; the extracted landslide object candidates are then optimized using rule-based classification. In the result validation section, the landslide detection of the proposed integration of the ResU-Net with a rule-based OBIA classification is compared with that of the ResU-Net alone. Our proposed approach improves the mean intersection-over-union of the map produced by the ResU-Net by more than 22%.
Funding: Supported by the National Natural Science Foundation of China (No. 61503363).
Abstract: Determining the navigation line is critical for the automatic navigation of agricultural robots in farmland. In this research, taking a wheat field as the typical scenario, a novel navigation line extraction algorithm based on semantic segmentation is proposed. Data containing horizontal parallax, height, and grayscale information (HHG) are constructed by combining re-encoded depth data and red-green-blue (RGB) data. The HHG, RGB, and depth data are used to achieve scene recognition and navigation line extraction for a wheat field. The method includes two main steps. First, semantic segmentation of the wheat, ground, and background is performed using a fully convolutional network (FCN). Second, the navigation line is fitted in the camera coordinate system on the basis of the semantic segmentation result and the principle of camera pinhole imaging. Our segmentation model is trained using 508 randomly selected images from a dataset and tested on 199 images. When labelled data are used as the reference benchmark, the mean intersection over union (mIoU) of the HHG data is greater than 95%, which is the highest among the three types of data. The semantic segmentation methods based on the RGB and HHG data show higher navigation line extraction accuracy rates (with the absolute value of the angle deviation less than 5°) than the compared methods. The mean and standard deviation of the angle deviation of the two methods are within 0.1° and 2.0°, while the mean and standard deviation of the distance deviation are less than 30 mm and 60 mm, respectively. These values meet the basic requirements of agricultural machinery field navigation. The novelty of this work is the proposal of a navigation line extraction algorithm based on semantic segmentation in wheat fields. The method is highly accurate and robust to interference from crop occlusion.
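The second step described in the abstract above, fitting a navigation line and measuring its angle deviation, can be sketched with an ordinary least-squares fit. This is an assumption-laden illustration: the abstract does not say which fitting method the authors used, and the coordinate convention (forward distance `z`, lateral offset `x`), the function names, and the sample points are all hypothetical.

```python
import math


def fit_navigation_line(points):
    """Least-squares fit of x = slope * z + intercept to (z, x) ground points."""
    n = len(points)
    z_mean = sum(z for z, _ in points) / n
    x_mean = sum(x for _, x in points) / n
    s_zz = sum((z - z_mean) ** 2 for z, _ in points)
    if s_zz == 0:
        raise ValueError("points must span more than one forward distance")
    s_zx = sum((z - z_mean) * (x - x_mean) for z, x in points)
    slope = s_zx / s_zz
    return slope, x_mean - slope * z_mean


def angle_deviation_deg(slope_fit, slope_ref):
    """Absolute heading difference, in degrees, between two fitted lines."""
    return abs(math.degrees(math.atan(slope_fit) - math.atan(slope_ref)))


if __name__ == "__main__":
    # Made-up ground points (forward distance z, lateral offset x), in metres.
    points = [(1.0, 0.3), (2.0, 0.4), (3.0, 0.5)]
    slope, intercept = fit_navigation_line(points)
    print(slope, intercept, angle_deviation_deg(slope, 0.0))
```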