Funding: Supported by the Fundamental Research Funds for Central Universities of the Civil Aviation University of China (No. 3122021088).
Abstract: The airport apron scene contains rich contextual information about spatial position relationships. Traditional object detectors consider only visual appearance and ignore this contextual information. In addition, the detection accuracy for some categories in the apron dataset is low. Therefore, an improved object detection method using spatial-aware features in apron scenes, called SA-FRCNN, is presented. The method uses graph convolutional networks to capture the relative spatial relationships between objects in the apron scene, incorporating this spatial context into feature learning. Moreover, an attention mechanism is introduced into the feature extraction process to focus on spatial positions and key features, and a distance-IoU loss is used to achieve more accurate regression. The experimental results show that the mean average precision of apron object detection based on SA-FRCNN reaches 95.75%, and detection of some hard-to-detect categories is significantly improved. The proposed method effectively improves detection accuracy on the apron dataset and compares favorably with other methods.
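Of the components named in this abstract, the distance-IoU (DIoU) loss has a standard published form (1 − IoU plus the normalized squared distance between box centers). The sketch below is a minimal PyTorch illustration of that loss; it is not the authors' implementation, and the box format and tensor names are assumptions.

```python
# Minimal DIoU loss sketch (1 - IoU + center-distance penalty), assuming (x1, y1, x2, y2) boxes.
import torch

def diou_loss(pred, target, eps=1e-7):
    """pred, target: (N, 4) boxes as (x1, y1, x2, y2). Returns the mean DIoU loss."""
    # Intersection area
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    # Union area and IoU
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distance between box centers
    cxp, cyp = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cxt, cyt = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    center_dist = (cxp - cxt) ** 2 + (cyp - cyt) ** 2

    # Squared diagonal of the smallest enclosing box
    ex1 = torch.min(pred[:, 0], target[:, 0])
    ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2])
    ey2 = torch.max(pred[:, 3], target[:, 3])
    diag = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    return (1 - iou + center_dist / (diag + eps)).mean()
```

Compared with a plain IoU or smooth-L1 regression term, the center-distance penalty keeps a useful gradient even when predicted and ground-truth boxes do not overlap, which is the usual motivation for adopting DIoU.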
Funding: Project (2020A1515010718) supported by the Basic and Applied Basic Research Foundation of Guangdong Province, China.
Abstract: Dense captioning aims to simultaneously localize and describe regions of interest (RoIs) in images in natural language. Specifically, we identify three key problems: 1) dense and highly overlapping RoIs, which make accurate localization of each target region challenging; 2) visually ambiguous target regions that are hard to recognize from appearance alone; and 3) the need for an extremely deep image representation, which is of central importance for visual recognition. To tackle these three challenges, we propose a novel end-to-end dense captioning framework consisting of a joint localization module, a contextual reasoning module, and a deep convolutional neural network (CNN). We also evaluate five deep CNN structures to explore the benefits of each. Extensive experiments on the Visual Genome (VG) dataset demonstrate the effectiveness of our approach, which compares favorably with state-of-the-art methods.
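The abstract does not describe how the contextual reasoning module works internally. The sketch below is a hypothetical illustration of one common way to inject context for visually ambiguous RoIs: attention-pooling over all RoI features in an image and fusing the pooled context with each RoI's own feature before caption decoding. The class and parameter names are assumptions, not the paper's API.

```python
# Hypothetical context-fusion sketch over RoI features (not the paper's actual module).
import torch
import torch.nn as nn

class ContextFusion(nn.Module):
    """Fuses each RoI feature with an attention-weighted sum of the RoI features in the image."""
    def __init__(self, dim=512):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, roi_feats):  # roi_feats: (N, dim) for N RoIs in one image
        q = self.query(roi_feats)                                       # (N, dim)
        k = self.key(roi_feats)                                         # (N, dim)
        attn = torch.softmax(q @ k.t() / q.size(-1) ** 0.5, dim=-1)     # (N, N) attention weights
        context = attn @ roi_feats                                      # (N, dim) pooled context
        return self.fuse(torch.cat([roi_feats, context], dim=-1))       # (N, dim) fused features

# Usage: the fused per-RoI features would feed a caption decoder (e.g., an LSTM).
feats = torch.randn(8, 512)       # 8 RoIs with 512-d appearance features
fused = ContextFusion()(feats)    # (8, 512)
```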
Abstract: The authors give some sufficient conditions for the difference of two closed convex sets to be closed in general Banach spaces, not necessarily reflexive.
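The result is easiest to appreciate against the standard counterexample below (well known, and not taken from the paper): the algebraic difference A − B = {a − b : a ∈ A, b ∈ B} of two closed convex sets can fail to be closed even in the plane, so some sufficient condition is genuinely needed.

```latex
% Standard counterexample (not from the paper): the difference of two closed
% convex sets need not be closed, even in the plane.
\[
  A = \{(x,y) \in \mathbb{R}^2 : x > 0,\ y \ge 1/x\}, \qquad
  B = \{(x,0) : x \in \mathbb{R}\}.
\]
% Both A and B are closed and convex, yet
\[
  A - B = \{(x - t,\, y) : x > 0,\ y \ge 1/x,\ t \in \mathbb{R}\}
        = \mathbb{R} \times (0,\infty),
\]
% an open half-plane, which is not closed. Sufficient conditions of the kind
% studied in the paper rule out this behavior.
```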
Abstract: Background: A colonoscopy can detect colorectal diseases, including cancers, polyps, and inflammatory bowel diseases. A computer-aided diagnosis (CAD) system using deep convolutional neural networks (CNNs) that can recognize anatomical locations during a colonoscopy could efficiently assist practitioners. We aimed to construct a CAD system using a CNN to distinguish colorectal images of the cecum, ascending colon, transverse colon, descending colon, sigmoid colon, and rectum. Method: We constructed a CNN by training it on 9,995 colonoscopy images and tested its performance on 5,121 independent colonoscopy images that were categorized according to seven anatomical locations: the terminal ileum, the cecum, ascending colon to transverse colon, descending colon to sigmoid colon, the rectum, the anus, and indistinguishable parts. We examined images taken during total colonoscopy performed between January 2017 and November 2017 at a single center. We evaluated the concordance between the diagnoses made by endoscopists and those made by the CNN. The main outcomes of the study were the sensitivity and specificity of the CNN for the anatomical categorization of colonoscopy images. Results: The constructed CNN recognized the anatomical locations of colonoscopy images with the following areas under the curve: 0.979 for the terminal ileum; 0.940 for the cecum; 0.875 for ascending colon to transverse colon; 0.846 for descending colon to sigmoid colon; 0.835 for the rectum; and 0.992 for the anus. During the test process, the CNN system correctly recognized 66.6% of images. Conclusion: We constructed a new CNN system with clinically relevant performance for recognizing the anatomical locations of colonoscopy images, which is the first step in constructing a CAD system that will support practitioners during colonoscopy and provide assurance of the quality of the colonoscopy procedure.
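The abstract does not specify the CNN architecture or training setup. The sketch below is a hypothetical illustration of how a seven-class anatomical-location classifier could be assembled with transfer learning in PyTorch/torchvision; the ResNet-50 backbone, hyperparameters, and function names are assumptions, not the authors' actual system.

```python
# Hypothetical 7-class anatomical-location classifier via transfer learning
# (backbone and hyperparameters are assumptions; the abstract does not specify them).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 7  # terminal ileum, cecum, ascending-transverse, descending-sigmoid,
                 # rectum, anus, indistinguishable

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # replace the 1000-class ImageNet head

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, labels):
    """One optimization step on a batch of colonoscopy images (N, 3, 224, 224)."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Per-class areas under the curve, as reported in the abstract, could then be computed
# one-vs-rest from softmax probabilities, e.g. with sklearn.metrics.roc_auc_score.
```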