Abstract: [Objective] Urban floods are occurring more frequently because of global climate change and urbanization. Accordingly, urban rainstorm and flood forecasting has become a priority in urban hydrology research. However, two-dimensional hydrodynamic models execute calculations slowly, hindering the rapid simulation and forecasting of urban floods. To overcome this limitation and improve both the speed and accuracy of urban flood simulation and forecasting, numerical simulation and deep learning were combined to develop a more effective urban flood forecasting method. [Methods] Specifically, a cellular automata model was used to simulate the urban flood process and to supply the large datasets needed for deep learning. Meanwhile, to shorten the time required for urban flood forecasting, a convolutional neural network model was used to establish the mapping relationship between rainfall and inundation depth. [Results] The results show that the relative error in forecasting the maximum inundation depth at flood-prone locations is less than 10%, and the Nash efficiency coefficient of the forecast inundation-depth series at flood-prone locations is greater than 0.75. [Conclusion] These results demonstrate that the proposed method executes simulations with high accuracy and produces forecasts quickly, illustrating its superiority as an urban flood forecasting technique.
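The two evaluation metrics quoted in the results, the relative error of the maximum inundation depth and the Nash (Nash–Sutcliffe) efficiency coefficient, reduce to short formulas. A minimal sketch follows; the depth series is hypothetical, for illustration only:

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o-s)^2) / sum((o-mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def relative_error(observed_max, predicted_max):
    """Relative error of the forecast maximum inundation depth."""
    return abs(predicted_max - observed_max) / observed_max

# Hypothetical inundation-depth series (metres) at one flood-prone location.
obs = [0.10, 0.25, 0.42, 0.38, 0.20]
sim = [0.12, 0.24, 0.40, 0.35, 0.22]
nse = nash_sutcliffe(obs, sim)              # close to 1 for a good forecast
err = relative_error(max(obs), max(sim))    # below 0.10 meets the paper's bar
```

A perfect forecast gives an efficiency of exactly 1; the paper's thresholds (relative error < 10%, efficiency > 0.75) are checks against these two quantities.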
Funding: The National Natural Science Foundation of China (No. 61603091).
Abstract: In order to improve the detection accuracy of small objects, a neighborhood fusion-based hierarchical parallel feature pyramid network (NFPN) is proposed. Unlike the layer-by-layer structure adopted in the feature pyramid network (FPN) and the deconvolutional single shot detector (DSSD), where the bottom layers of the feature pyramid depend on the top layers, NFPN builds the feature pyramid with no connections between the upper and lower layers; that is, it fuses only shallow features on similar scales. NFPN is highly portable and can be embedded in many models to further boost performance. Extensive experiments on the PASCAL VOC 2007, 2012, and COCO datasets demonstrate that the NFPN-based SSD, without intricate tricks, can exceed the DSSD model in both detection accuracy and inference speed, especially for small objects, e.g., 4% to 5% higher mAP (mean average precision) than SSD and 2% to 3% higher mAP than DSSD. On the VOC 2007 test set, the NFPN-based SSD with 300×300 input reaches 79.4% mAP at 34.6 frames/s, and the mAP can rise to 82.9% with a multi-scale testing strategy.
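The topological difference between a top-down pyramid and NFPN's neighborhood-only fusion can be shown with a toy sketch that tracks only which levels feed each output (this is an illustration of the dependency structure, not the authors' network):

```python
def topdown_fuse(pyramid):
    """FPN/DSSD-style top-down path: each output level accumulates every
    deeper level, so the bottom of the pyramid depends on the top."""
    out, carry = [], 0
    for ch in reversed(pyramid):          # walk deep -> shallow
        carry += ch                       # carry deeper features downward
        out.append(carry)
    return list(reversed(out))

def nfpn_fuse(pyramid):
    """NFPN-style fusion: each output level combines only its immediate
    neighbours on similar scales -- no upper-lower cross connections."""
    return [sum(pyramid[max(0, i - 1):i + 2]) for i in range(len(pyramid))]

# pyramid entries are per-level channel counts, shallow -> deep
levels = [256, 512, 1024]
```

In `topdown_fuse` the shallowest output changes whenever any deeper level changes; in `nfpn_fuse` it depends only on its direct neighbor, which is the independence property the abstract describes.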
Funding: This work was supported by the National Natural Science Foundation of China (Nos. U1833103, 71801215) and the China Civil Aviation Environment and Sustainable Development Research Center Open Fund (No. CESCA2019Y04).
Abstract: With the continuous increase in the number of flights, airport collaborative decision-making (A-CDM) systems have become increasingly widespread. The accuracy of taxi time prediction has an important effect on the A-CDM calculation of the departure aircraft's take-off queue and on the accurate block-out time of the aircraft. The spatial-temporal-environment deep learning (STEDL) model is presented to improve the prediction accuracy of departure aircraft taxi-out time. The model is composed of a time-flow sub-model (airport capacity, number of taxiing aircraft, and different time periods), a spatial sub-model (taxiing distance), and an environmental sub-model (weather, air traffic control, runway configuration, and aircraft category). The STEDL model is used to predict the taxi time of departure aircraft at Hong Kong Airport, and the results show that the STEDL method achieves a prediction accuracy of 95.4%. The proposed model also greatly reduces the prediction error rate compared with other machine learning methods.
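The three sub-models feed one combined input to the predictor. A sketch of how such a feature vector might be assembled is below; the dictionary keys and the flat-concatenation encoding are illustrative assumptions, not the paper's actual preprocessing:

```python
def stedl_features(time_flow, spatial, environment):
    """Concatenate the time-flow, spatial, and environmental feature
    groups into a single input vector for the prediction network.
    Key names are hypothetical stand-ins for the paper's variables."""
    return ([time_flow[k] for k in ("capacity", "n_taxiing", "period")]
            + [spatial["taxi_distance"]]
            + [environment[k] for k in ("weather", "atc", "runway", "category")])

vec = stedl_features(
    {"capacity": 40, "n_taxiing": 12, "period": 2},   # time-flow sub-model
    {"taxi_distance": 3.5},                           # spatial sub-model (km)
    {"weather": 1, "atc": 0, "runway": 3, "category": 2})  # environmental
```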
Funding: Scientific Research Project of the Education Department of Hunan Province (20C1435) and the Open Fund Project for Computer Science and Technology of Hunan University of Chinese Medicine (2018JK05).
Abstract: Objective To propose two novel deep learning methods for computer-aided tongue diagnosis, covering tongue image segmentation and tongue color classification, and to improve their diagnostic accuracy. Methods LabelMe was used to label the tongue mask, and the Snake model was used to optimize the labeling results. A new dataset was constructed for tongue image segmentation, and tongue color was labeled to build a classification dataset for network training. In this research, the Inception + Atrous Spatial Pyramid Pooling (ASPP) + UNet (IAUNet) method was proposed for tongue image segmentation, based on the existing UNet, Inception, and atrous convolution. Moreover, the Tongue Color Classification Net (TCCNet) was constructed with reference to ResNet, Inception, and Triplet-Loss. Several important measurement indexes were selected to evaluate and compare the novel and existing methods for tongue segmentation and tongue color classification: IAUNet was compared with existing mainstream methods such as UNet and DeepLabV3+ for tongue segmentation, and TCCNet was compared with VGG16 and GoogLeNet for tongue color classification. Results IAUNet can accurately segment the tongue from original images. The Mean Intersection over Union (MIoU) of IAUNet reached 96.30%, and its Mean Pixel Accuracy (MPA), mean Average Precision (mAP), F1-Score, G-Score, and Area Under Curve (AUC) reached 97.86%, 99.18%, 96.71%, 96.82%, and 99.71%, respectively, suggesting that IAUNet produced better segmentation than the other methods, with fewer parameters. Triplet-Loss was applied in the proposed TCCNet to separate different embedded colors. The experiment yielded ideal results, with the F1-Score and mAP of TCCNet reaching 88.86% and 93.49%, respectively. Conclusion IAUNet, based on deep learning, is better than traditional networks for tongue segmentation: it not only produces ideal tongue segmentation but also outperforms PSPNet, SegNet, UNet, and DeepLabV3+. For tongue color classification, the proposed TCCNet achieved better F1-Score and mAP values than other neural networks such as VGG16 and GoogLeNet.
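The segmentation metric (MIoU) and the embedding loss (Triplet-Loss) named above can both be written in a few lines. This is a generic sketch with toy binary masks and 2-D embeddings, not the paper's implementation:

```python
def iou(pred, target):
    """Intersection over Union for two binary masks (flat 0/1 lists)."""
    inter = sum(p & t for p, t in zip(pred, target))
    union = sum(p | t for p, t in zip(pred, target))
    return inter / union if union else 1.0

def mean_iou(pairs):
    """MIoU: average IoU over (prediction, ground-truth) mask pairs."""
    return sum(iou(p, t) for p, t in pairs) / len(pairs)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss, max(0, d(a,p) - d(a,n) + margin), pushing
    same-color embeddings together and different-color ones apart."""
    d = lambda x, y: sum((a - b) ** 2 for a, b in zip(x, y))
    return max(0.0, d(anchor, positive) - d(anchor, negative) + margin)
```

A loss of zero means the negative is already further from the anchor than the positive by at least the margin, which is the separation TCCNet trains for.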
Abstract: Objective We developed a universal lesion detector (ULDor) that showed good performance in in-lab experiments. This study aims to evaluate its performance and ability to generalize in clinical settings via both external and internal validation. Methods The ULDor system consists of a convolutional neural network (CNN) trained on around 80K lesion annotations from about 12K CT studies in the DeepLesion dataset and five other public organ-specific datasets. The test sets comprised two parts: an external validation dataset of 164 sets of non-contrast chest and upper-abdomen CT scans from a comprehensive hospital, and an internal validation dataset of 187 sets of low-dose helical CT scans from the National Lung Screening Trial (NLST). We ran the model on the two test sets to output lesion detections. Three board-certified radiologists read the CT scans and verified the detection results of ULDor. We used positive predictive value (PPV) and sensitivity to evaluate the performance of the model in detecting space-occupying lesions in all extra-pulmonary organs visualized on CT images, including the liver, kidney, pancreas, adrenal glands, spleen, esophagus, thyroid, lymph nodes, body wall, and thoracic spine. Results In the external validation, the lesion-level PPV and sensitivity of the model were 57.9% and 67.0%, respectively. On average, the model detected 2.1 findings per set, of which 0.9 were false positives. ULDor worked well for detecting liver lesions, with a PPV of 78.9% and a sensitivity of 92.7%, followed by the kidney, with a PPV of 70.0% and a sensitivity of 58.3%. In internal validation with the NLST test set, ULDor obtained a PPV of 75.3% and a sensitivity of 52.0%, despite the relatively high noise level of soft tissue on the images. Conclusions The performance tests of ULDor with external real-world data have shown its high effectiveness in multi-purpose detection of lesions in certain organs. With further optimization and iterative upgrades, ULDor may be well suited for extensive application to external data.
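PPV and sensitivity are simple ratios over the radiologist-verified counts. A minimal sketch follows; the counts in the usage example are hypothetical, not the study's data:

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: verified detections / all detections."""
    return true_pos / (true_pos + false_pos)

def sensitivity(true_pos, false_neg):
    """Sensitivity (recall): verified detections / all true lesions."""
    return true_pos / (true_pos + false_neg)

# Hypothetical per-organ tallies after radiologist review.
liver_ppv = ppv(true_pos=76, false_pos=20)
liver_sens = sensitivity(true_pos=76, false_neg=6)
```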
Abstract: In order to improve the accuracy of threaded hole detection, we combine a dual-camera vision system with Hough transform circle detection and propose a threaded hole detection method for workpieces based on the Faster region-based convolutional neural network (Faster R-CNN). First, a dual-camera image acquisition system is established. One industrial camera placed at a high position collects the whole image of the workpiece, from which suspected screw hole positions can be preliminarily selected by the Hough transform detection algorithm. Then, the other industrial camera collects local images of the suspected screw holes detected by the Hough transform, one by one. After that, a ResNet50-based Faster R-CNN object detection model is trained on the self-built screw hole dataset. Finally, the local image of each threaded hole is input into the trained Faster R-CNN model for further identification and localization. The experimental results show that the proposed method effectively avoids the difficulty of small-object detection of threaded holes and, compared with methods using only the Hough transform or only Faster R-CNN detection, achieves high recognition and localization accuracy.
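The two-stage coarse-to-fine pipeline described above can be sketched as below. The four callables (global image, close-up capture, Hough detector, trained classifier) are stand-ins for the real camera and model APIs, not the paper's code:

```python
def detect_threaded_holes(global_img, capture_local, hough_circles, classify):
    """Two-stage pipeline: Hough-circle candidates on the wide-view image,
    then per-candidate Faster R-CNN-style verification on close-up crops.
    All callables are hypothetical stand-ins for hardware/model APIs."""
    results = []
    for (x, y, r) in hough_circles(global_img):   # stage 1: coarse candidates
        local = capture_local(x, y)               # second camera close-up
        label, box = classify(local)              # stage 2: fine verification
        if label == "threaded_hole":
            results.append(((x, y, r), box))      # keep confirmed holes only
    return results
```

Because the second camera images each candidate at close range, the fine-stage detector never has to find small objects in the full-scene image, which is the point of the design.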
Funding: Project (2020A1515010718) supported by the Basic and Applied Basic Research Foundation of Guangdong Province, China.
Abstract: Dense captioning aims to simultaneously localize and describe regions of interest (RoIs) in images in natural language. We identify three key problems: 1) dense and highly overlapping RoIs, which make accurate localization of each target region challenging; 2) visually ambiguous target regions that are hard to recognize by appearance alone; and 3) the need for an extremely deep image representation, which is of central importance for visual recognition. To tackle these three challenges, we propose a novel end-to-end dense captioning framework consisting of a joint localization module, a contextual reasoning module, and a deep convolutional neural network (CNN). We also evaluate five deep CNN structures to explore the benefits of each. Extensive experiments on the Visual Genome (VG) dataset demonstrate the effectiveness of our approach, which compares favorably with state-of-the-art methods.
Funding: Supported by the National Natural Science Foundation of China (Nos. U1933117, 61773202, 52072174).
Abstract: A data-driven method for arrival pattern recognition and prediction is proposed to provide air traffic controllers (ATCOs) with decision support. For arrival pattern recognition, a clustering-based method is proposed to group arrival patterns by control intention. For arrival pattern prediction, two predictors are trained to estimate the most probable command issued by the ATCOs in a particular traffic situation. Training the arrival pattern predictor can be regarded as building an ATCO simulator: the simulator can assign an appropriate arrival pattern to each arrival aircraft, just as real ATCOs do, and is therefore considered able to provide effective advice for part of the ATCOs' work. Finally, a case study demonstrates that the convolutional neural network (CNN)-based predictor performs better than the random forest (RF)-based one.
Abstract: Single image super-resolution (SISR) is a fundamentally challenging problem because a low-resolution (LR) image can correspond to a set of high-resolution (HR) images, most of which are not desired. Recently, SISR has been achieved by deep learning-based methods: by constructing a very deep super-resolution convolutional neural network (VDSRCNN), LR images can be improved to HR images. This study pursues two objectives: image super-resolution (ISR) and image deblurring with VDSRCNN. First, analyzing ISR, we modify different training parameters to test the performance of VDSRCNN. Second, we add motion-blurred images to the training set to optimize the performance of VDSRCNN. Finally, we use image quality indexes to evaluate the difference between images produced by classical methods and by VDSRCNN. The results indicate that the optimized VDSRCNN performs better in generating HR images from LR images.
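VDSR-style networks learn only a residual that is added back to the upscaled LR input. A minimal sketch of that reconstruction step, with toy 2-D lists standing in for images, is:

```python
def vdsr_reconstruct(lr_upscaled, residual):
    """VDSR-style reconstruction: the network predicts a residual image,
    and the HR estimate is the (e.g. bicubically) upscaled LR input plus
    that residual, HR ~= LR_up + R."""
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(lr_upscaled, residual)]
```

Learning the residual rather than the full HR image is what lets such networks go very deep without the output drifting far from the input.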
Funding: Project supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA23090203), the National Key Technologies Research and Development Program of China (No. 2016YFB0502600), and the Key Program of the Sichuan Bureau of Science and Technology (No. 2018SZ0350), China.
Abstract: In recent years, deep learning methods have gradually come into use in hyperspectral imaging domains. Because of the peculiarity of hyperspectral imaging, a mass of information is contained in the spectral dimensions of hyperspectral images. Also, different objects on a land surface are sensitive to different ranges of wavelength. To achieve higher classification accuracy, we propose a structure that combines spectral sensitivity with a convolutional neural network by adding spectral weights, derived from predicted outcomes, before the final classification layer. First, samples are divided into visible-light and infrared bands, with a portion of the samples fed into networks during training. Then, two key parameters, the unrecognized rate (δ) and the wrongly recognized rate (γ), are calculated from the predicted outcome of the whole scene. Next, the spectral weight, derived from these two parameters, is calculated. Finally, the spectral weight is applied and an improved structure is constructed. The improved structure not only combines features in the spatial and spectral dimensions but also gives spectral sensitivity a primary status. Compared with inputs from the whole spectrum, the improved structure attains nearly 2% higher prediction accuracy, and on public datasets it achieves approximately 1% higher accuracy on average.
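The abstract derives a spectral weight from δ and γ but does not state the formula; one plausible form, assumed here purely for illustration, downweights bands with high unrecognized or wrongly recognized rates:

```python
def spectral_weight(delta, gamma):
    """Assumed weighting (not the paper's stated formula): bands with low
    unrecognized rate (delta) and low wrongly recognized rate (gamma)
    receive a weight near 1; unreliable bands approach 0."""
    return max(0.0, 1.0 - (delta + gamma))

def reweight(logits, weights):
    """Apply per-band spectral weights before the final classification
    layer, giving spectral sensitivity a primary status."""
    return [l * w for l, w in zip(logits, weights)]

weights = [spectral_weight(0.1, 0.2), spectral_weight(0.0, 0.0)]
scaled = reweight([2.0, 4.0], weights)
```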
Abstract: Background: A colonoscopy can detect colorectal diseases, including cancers, polyps, and inflammatory bowel diseases. A computer-aided diagnosis (CAD) system using deep convolutional neural networks (CNNs) that can recognize anatomical locations during a colonoscopy could efficiently assist practitioners. We aimed to construct a CAD system using a CNN to distinguish colorectal images of the cecum, ascending colon, transverse colon, descending colon, sigmoid colon, and rectum. Methods: We constructed a CNN by training on 9,995 colonoscopy images and tested its performance on 5,121 independent colonoscopy images categorized according to seven anatomical locations: the terminal ileum; the cecum; the ascending colon to transverse colon; the descending colon to sigmoid colon; the rectum; the anus; and indistinguishable parts. We examined images taken during total colonoscopies performed between January 2017 and November 2017 at a single center, and evaluated the concordance between the diagnoses by endoscopists and those by the CNN. The main outcomes of the study were the sensitivity and specificity of the CNN for the anatomical categorization of colonoscopy images. Results: The constructed CNN recognized the anatomical locations of colonoscopy images with the following areas under the curve: 0.979 for the terminal ileum; 0.940 for the cecum; 0.875 for the ascending colon to transverse colon; 0.846 for the descending colon to sigmoid colon; 0.835 for the rectum; and 0.992 for the anus. During the test process, the CNN system correctly recognized 66.6% of images. Conclusion: We constructed a new CNN system with clinically relevant performance for recognizing the anatomical locations of colonoscopy images, which is the first step in constructing a CAD system that will support practitioners during colonoscopy and provide assurance of the quality of the colonoscopy procedure.
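The outcome measures above (per-class specificity and the overall concordance rate between CNN and endoscopist labels) reduce to simple counts. A minimal sketch with toy labels:

```python
def specificity(true_neg, false_pos):
    """Specificity for one location class: correctly rejected images
    divided by all images that do not belong to the class."""
    return true_neg / (true_neg + false_pos)

def concordance(cnn_labels, endoscopist_labels):
    """Fraction of images on which the CNN label agrees with the
    endoscopist's label (the overall recognition rate)."""
    agree = sum(c == e for c, e in zip(cnn_labels, endoscopist_labels))
    return agree / len(cnn_labels)

rate = concordance(["cecum", "rectum", "anus"],
                   ["cecum", "anus", "anus"])   # toy labels, agrees on 2 of 3
```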