Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61672181 and 51679058, and the Natural Science Foundation of Heilongjiang Province under Grant No. F2016005. We thank our teacher for guidance on this paper and our classmates for their encouragement and help.
Abstract: The classification and identification of brain diseases using multimodal information has attracted increasing attention in computer-aided diagnosis. Compared with traditional methods that use single-modality features, multimodal information fusion can classify and diagnose brain diseases more comprehensively and accurately. Existing multimodal methods require manual feature extraction or additional personal information, which is labor-intensive. Furthermore, differences between imaging modalities, together with differing manual feature extraction procedures, make it difficult for models to learn an optimal solution. In this paper, we propose a multimodal 3D convolutional neural network framework for brain disease classification using MR and PET images of subjects. We evaluate the proposed approach on classifying Alzheimer’s disease (AD) versus mild cognitive impairment (MCI) and normal controls (NC) on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) data set of 3D structural MRI brain scans and FDG-PET images. Experimental results show accuracies of 93.55% for AD vs. NC and 78.92% for MCI vs. NC. The accuracy of the three-class (AD, MCI, NC) experiment is 68.86%.
Abstract: Because multimode data consists of many modes and the complex relationships among them, it cannot be retrieved or mined effectively with traditional analysis and processing techniques designed for single-mode data. To address this challenge, we design and implement a graph-based storage and parallel loading system for multimode medical image data. The system is a framework for flexibly storing and rapidly loading such data. Specifically, it uses the Mode Network to model the modes and their relationships in multimode medical image data, and a graph database to store the data with a parallel loading technique.
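The idea of modeling modes and their relationships as a graph and loading related modes in parallel can be sketched as follows. This is only an illustration under stated assumptions: the `ModeNetwork` class, the mode names, and the `load_mode` placeholder are hypothetical, and an in-memory dictionary stands in for the graph database, whose schema the abstract does not describe.

```python
from concurrent.futures import ThreadPoolExecutor


class ModeNetwork:
    """Hypothetical in-memory stand-in for a graph of modes and their relations."""

    def __init__(self):
        self.nodes = {}  # mode name -> payload reference (e.g. a file path)
        self.edges = {}  # mode name -> set of related mode names

    def add_mode(self, name, payload_ref):
        self.nodes[name] = payload_ref
        self.edges.setdefault(name, set())

    def relate(self, a, b):
        # Relations are stored as undirected edges in this sketch.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def related(self, name):
        return sorted(self.edges[name])


def load_mode(payload_ref):
    # Placeholder loader: a real system would read image data here.
    return f"loaded:{payload_ref}"


def parallel_load(network, start, max_workers=4):
    """Load a mode and all directly related modes concurrently."""
    names = [start] + network.related(start)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `names`.
        results = list(pool.map(lambda n: load_mode(network.nodes[n]), names))
    return dict(zip(names, results))


net = ModeNetwork()
net.add_mode("MRI", "scan_001_mri")
net.add_mode("PET", "scan_001_pet")
net.add_mode("report", "scan_001_txt")
net.relate("MRI", "PET")
net.relate("MRI", "report")
loaded = parallel_load(net, "MRI")
```

Threads are used here because loading is typically I/O-bound; whether the described system uses threads, processes, or database-side parallelism is not stated in the abstract.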
Funding: This paper is partly supported by: 1. the Fund of PhD Supervisor from China Institute Committee (20132304110018); 2. the Natural Fund of Heilongjiang Province (F201246); 3. the National Natural Science Foundation of China under Grant 61272184.
Abstract: This paper describes an automatic system for face modeling with 3D big data, using orthogonal front-view and side-view images taken by an ordinary digital camera. The paper has four key parts. First, we study 3D big data face modeling, including facial feature extraction from 2D images. Second, we present the techniques from computer vision and image processing, together with our new method for extracting information from images and creating a 3D model. Third, the software for 3D face modeling from 2D images is implemented in C# with the EMGU CV library and the XNA framework. Finally, we design experiments, run tests, and record the results to measure the performance of our method.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61672181 and 51679058, and the Natural Science Foundation of Heilongjiang Province under Grant No. F2016005.
Abstract: Breast cancer is the most common malignant tumor in women worldwide. Early breast cancer screening is key to reducing mortality. Clinical trials have shown that computer-aided diagnosis (CAD) improves the accuracy of breast cancer detection. Mammogram segmentation is a critical step in CAD. In recent years, fully convolutional networks (FCNs) have been applied to image segmentation. Generative adversarial networks (GANs) have also become popular for their ability to generate images that are difficult to distinguish from real ones, and have been applied to semantic image segmentation. We apply dilated convolutions to some of the convolutional layers of Multi-FCN and use the ideas of GANs to train and correct our segmentation network. Experiments show that the Dice index of the DMulti-FCN-CRF-Adversarial Training model on the INbreast and DDSM-BCRP datasets reaches 91.15% and 91.8%, respectively.
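For readers unfamiliar with the Dice index reported above, a minimal sketch of the standard definition, Dice = 2|A and B| / (|A| + |B|), on two binary masks is given below. The function name and the toy masks are illustrative and are not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```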
Abstract: Advances in medical image acquisition and storage technology have led to rapid growth of the related data. Retrieving similar medical images can help doctors diagnose diseases more accurately. However, because of the particular nature of medical images, traditional content-based image retrieval (CBIR) methods such as bag-of-words (BOW) cannot be applied to them. For example, when retrieving a diseased image, we must consider not only similar characteristics but also the type of lesion. Moreover, medical images with the same lesion may have different image features, and similar images may show different types of lesions. In this paper, a Markov random field (MRF) is constructed, and an approximate belief propagation algorithm is used to retrieve images. A re-ranking adjustment step after the initial retrieval further improves retrieval performance. Experiments on real brain CT images show that the proposed method significantly improves retrieval accuracy and is efficient.
Funding: This research is supported by the Natural Science Foundation of China (Nos. 61202090, 61272184, 61370084), the Program for New Century Excellent Talents in University (No. NCET-11-0829), the Natural Science Foundation of Heilongjiang Province (No. F201130), and the Fundamental Research Funds for the Central Universities (No. HEUCF100609).
Abstract: The popularity of social network services has led to rapid growth in the number of users. Predicting links between users is recognized as one of the key tasks in social network analysis. Most existing link prediction methods either analyze the topology of the social network graph or consider only users’ interests, which leads to low prediction accuracy. Furthermore, the large amount of user interest information makes extracting common interests difficult. To solve these problems, this paper proposes a joint social network link prediction method, JLPM. First, we give the problem formulation. Second, we define a joint prediction feature model (JPFM) that describes user interest topic features and network topology features together, and present the corresponding feature extraction algorithm. JPFM uses the LDA topic model to extract user interest topics and a random walk algorithm to extract network topology features. Third, by transforming link prediction into a classification problem, we use a standard SVM classifier to predict possible links. Finally, experimental results on a citation data set show the feasibility of our method.
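The step of turning a candidate link into a feature vector for a classifier can be sketched as below. This is not the paper's JPFM: it assumes a random walk with restart for the topology feature and cosine similarity between per-user topic distributions (which would come from LDA) for the interest feature. The graph, the topic vectors, and all function names are hypothetical, and the SVM training step is omitted.

```python
import math


def random_walk_with_restart(adj, source, restart=0.15, iters=50):
    """Approximate visiting probabilities of a walker that restarts at `source`."""
    nodes = sorted(adj)
    score = {n: 0.0 for n in nodes}
    score[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            neighbors = adj[n]
            if not neighbors:
                # Dangling node: send its walking mass back to the source.
                nxt[source] += (1 - restart) * score[n]
                continue
            share = (1 - restart) * score[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        # The restart mass always returns to the source node.
        nxt[source] += restart
        score = nxt
    return score


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 0.0 if norm_u == 0 or norm_v == 0 else dot / (norm_u * norm_v)


def pair_features(adj, topics, u, v):
    """Feature vector [topology proximity, interest similarity] for pair (u, v)."""
    return [random_walk_with_restart(adj, u)[v], cosine(topics[u], topics[v])]


# Toy undirected graph and hypothetical two-topic distributions per user.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
topics = {"a": [0.7, 0.3], "b": [0.6, 0.4], "c": [0.2, 0.8], "d": [0.1, 0.9]}
features = pair_features(adj, topics, "a", "d")
scores_from_a = random_walk_with_restart(adj, "a")
```

Feature vectors of this kind, labeled by whether the link exists, could then be passed to any binary classifier such as an SVM.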
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61672181 and 51679058, and the Natural Science Foundation of Heilongjiang Province under Grant No. F2016005. We thank our teacher for guidance on this paper and our classmates for their encouragement and help. We acknowledge the International Skin Imaging Collaboration (ISIC) for publishing the ISIC 2016 Skin Lesion Classification Dataset. We also thank the scholars cited in this paper for their support and answers.
Abstract: Skin lesion classification in dermoscopy images plays an important role in improving diagnostic performance and reducing melanoma deaths. The task remains challenging: deep learning requires a large amount of training data, and existing skin lesion classification algorithms have certain limitations, so classification accuracy needs further improvement. In this paper, a mutual learning model is presented to separate malignant from benign skin lesions. The model enables dual deep convolutional neural networks to learn from each other. Experimental results on the ISIC 2016 Skin Lesion Classification dataset indicate that the mutual learning model achieves state-of-the-art performance.
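The abstract does not give the training objective, so the following is only a sketch of one common way two networks can learn from each other: each network minimizes its own cross-entropy plus a KL divergence that pulls its prediction toward its peer's. Treating this as the paper's loss is an assumption; the logits and function names are illustrative.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_learning_losses(logits_a, logits_b, label):
    """Per-network loss: supervised term plus a term mimicking the peer."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    loss_a = cross_entropy(p_a, label) + kl_divergence(p_b, p_a)
    loss_b = cross_entropy(p_b, label) + kl_divergence(p_a, p_b)
    return loss_a, loss_b


# Two-class example; class indices (e.g. 0 = benign, 1 = malignant) are assumed.
loss_a, loss_b = mutual_learning_losses([2.0, 0.5], [1.0, 1.0], label=0)
```

When both networks output the same distribution, the KL terms vanish and each loss reduces to plain cross-entropy.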
Abstract: Medical images are important for medical research and clinical diagnosis. Research on medical images covers image acquisition, processing, analysis, and related fields. Crowdsourcing has attracted growing interest in recent years as an effective tool. It can harness human intelligence to solve problems that computers do not handle well, such as sentiment analysis and image recognition. Crowdsourcing can achieve higher accuracy in medical image classification, but its low efficiency and monetary cost prevent wide use. Medical image classification algorithms have a high error rate near the decision threshold, and improving these algorithms yields only insignificant accuracy gains. To address this problem, we propose a hybrid framework that combines classification algorithms with a crowdsourcing system and achieves significantly higher accuracy than classification algorithms alone. At the same time, only the images that the classification algorithms handle poorly are sent to the crowd, which lowers the monetary cost. Within the framework, we devise an effective algorithm to generate a range-threshold that assigns each image to either crowdsourcing or the classification algorithm. Experimental results show that our method improves the accuracy of medical image classification and reduces the monetary cost of crowdsourcing.
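The routing idea behind the range-threshold can be sketched as follows. The paper generates the range with its own algorithm; here the threshold and margin are simply fixed by hand as an assumption, and the image identifiers and scores are made up.

```python
def route_images(scores, threshold=0.5, margin=0.1):
    """Send images whose score is near the threshold to the crowd.

    Images with a score farther than `margin` from `threshold` keep the
    classifier's label (1 if above the threshold, else 0).
    """
    to_crowd, auto_labels = [], {}
    for image_id, score in scores.items():
        if abs(score - threshold) <= margin:
            to_crowd.append(image_id)  # uncertain: ask human workers
        else:
            auto_labels[image_id] = int(score > threshold)
    return to_crowd, auto_labels


# Hypothetical classifier confidence scores for four images.
scores = {"img1": 0.95, "img2": 0.52, "img3": 0.10, "img4": 0.41}
crowd, auto = route_images(scores)
```

A wider margin sends more images to the crowd, trading monetary cost for accuracy, which is the balance the framework aims to tune.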
Funding: This work was supported by the National Natural Science Foundation of China (No. 62072135), the Natural Science Foundation of Ningxia Hui Autonomous Region (No. 2022AAC03346), and the Fundamental Research Funds for the Central Universities (No. 3072020CF0602).
Abstract: As a deep learning network with an encoder-decoder architecture, UNet and its improved variants have been widely used in medical image segmentation. However, when used to segment targets in 3D medical images such as magnetic resonance imaging (MRI) and computed tomography (CT), these models do not model the correlation between images along the vertical axis, which results in inaccurate analysis of consecutive slices from the same patient. In addition, the large amount of detail lost during encoding makes these models unable to segment small-scale tumor targets. For small-scale target segmentation in 3D medical images, a new neural network model, SUNet++, is proposed on the basis of UNet and UNet++. SUNet++ improves on existing models in three main respects: 1) a slice superposition modeling strategy is used to fully exploit the three-dimensional information in the data; 2) an attention mechanism added to the decoding process retains and amplifies small-scale targets in the image; 3) transposed convolution is used in up-sampling to further improve the model. To verify the model, we collected and produced a dataset of hyperintensity MRI liver-stage images containing over 400 cases of liver nodules. Experimental results on both public and proprietary datasets demonstrate the superiority of SUNet++ in small-scale target segmentation of 3D medical images.
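The abstract does not define slice superposition precisely. One plausible reading, assumed here, is that each slice is fed to the network together with its neighbors, stacked as input channels so that a 2D model sees some vertical context. The helper name and the toy volume are hypothetical.

```python
def superpose_slices(volume, index, radius=1):
    """Return the slices from index - radius to index + radius as a stack.

    Indices that fall outside the volume are clamped to the nearest valid
    slice, so the stack always holds 2 * radius + 1 entries.
    """
    last = len(volume) - 1
    picks = [min(max(i, 0), last) for i in range(index - radius, index + radius + 1)]
    return [volume[i] for i in picks]


# Toy volume: five 1x1 slices whose single pixel equals the slice index.
volume = [[[z]] for z in range(5)]
middle_stack = superpose_slices(volume, 2)  # slices 1, 2, 3
edge_stack = superpose_slices(volume, 0)    # slice 0 repeated at the edge
```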
Abstract: The virtual test platform is a vital tool for ship simulation and testing. However, the numerical pool ship virtual test platform is a complex system comprising multiple heterogeneous data types, such as relational data, files, text, images, and animations. Analysis, evaluation, and decision-making depend heavily on data, which continue to grow in size and complexity. As a result, there is an increasing need for a distributed database system to manage these data. In this paper, we propose a key-value database built on a distributed system that can operate on data of any size or type. The database architecture supports class column storage and load balancing, and optimizes I/O bandwidth and CPU resource utilization. Moreover, it is specifically designed to handle the storage and access of large files. Additionally, we propose a multimodal data fusion mechanism that connects various descriptions of the same entity, enabling the fusion and retrieval of heterogeneous multimodal data to facilitate data analysis. Our approach focuses on indexing and storage, and we compare our solution with Redis, MongoDB, and MySQL through experiments. We demonstrate the performance, scalability, and reliability of the proposed database system, analyze the shortcomings of its architecture, and provide optimization solutions and future research directions. In conclusion, our database system provides an efficient and reliable solution for data management in the virtual test platform of numerical pool ships.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62072135, 61672181).
Abstract: Lesion detection in computed tomography (CT) images is a challenging task in computer-aided diagnosis. An important issue is locating the lesion area accurately. As a branch of convolutional neural networks (CNNs), 3D Context-Enhanced (3DCE) frameworks are designed to detect lesions on CT scans. The false positives (FPs) detected by 3DCE frameworks are usually caused by inaccurate region proposals, which also slow down inference. To solve these problems, a new method is proposed in which a dimension-decomposition region proposal network is integrated into the 3DCE framework to improve localization accuracy in lesion detection. Freed from the restriction of fixed anchor ratios and scales, anchors are decomposed into independent "anchor strings". Anchor segments are dynamically combined according to probability, and anchor strings of different lengths dynamically compose bounding boxes. Experiments show that the accurate region proposals generated by our model improve sensitivity at given FP levels and require less inference time than current methods.
Funding: This paper is supported by the National Natural Science Foundation of China under Grant Nos. 62072135 and 61672181.
Abstract: Skin melanoma is one of the most common malignant tumors originating from melanocytes, and its incidence in the Chinese population shows a continuously increasing trend. Early and accurate diagnosis of melanoma is of great significance for guiding clinical treatment. However, the symptoms of malignant melanoma are not obvious in the early stage, so it is difficult to diagnose by human observation, and a missed diagnosis allows it to spread easily. To diagnose melanoma accurately, an end-to-end skin lesion attribute segmentation framework is presented in this paper to facilitate the digitalization of attribute segmentation. The framework improves on the U-Net structure by using a channel context feature fusion module between the encoder and decoder to further merge context information. A dual-domain attention module is proposed to extract more effective information from the feature map. Results show that the proposed method effectively segments lesion attributes and achieves good results on the ISIC 2018 Task 2 dataset.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61672181.
Abstract: Detection efficiency plays an increasingly important role in object detection tasks. One-stage methods are widely adopted in practice because of their high efficiency, especially in real-time detection tasks such as face recognition and self-driving cars. RetinaMask made significant progress among one-stage detectors by adding a semantic segmentation branch, but it is limited in detecting multi-scale objects. To solve this problem, this paper proposes the RetinaMask with Gate (RMG) model, which consists of four main modules. It extends RetinaMask with a gate mechanism that extracts and combines features at different levels more effectively according to object size. First, multi-level features are extracted from the input image by ResNet. Second, a fused feature pyramid is constructed through a feature pyramid network. Third, the gate mechanism adaptively enhances and integrates features at various scales with respect to object size. Finally, three prediction heads are added for classification, localization, and mask prediction, driving the model to learn with mask prediction. The predictions from all levels are integrated during post-processing. The augmented network shows better object detection performance, especially for small objects, without increasing computation cost or inference time.
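How a gate can weight features from different pyramid levels before they are merged is sketched below. The actual RMG gate design is not given in the abstract; this sketch assumes one scalar sigmoid gate per level and an element-wise weighted sum, with gate logits supplied by hand rather than learned. All names and values are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(level_features, gate_logits):
    """Weight each pyramid level by a sigmoid gate, then sum element-wise.

    level_features: list of equally sized feature vectors, one per level.
    gate_logits: one scalar per level; in a real model these would be
    predicted by the network rather than passed in.
    """
    if len(level_features) != len(gate_logits):
        raise ValueError("need exactly one gate per feature level")
    gates = [sigmoid(g) for g in gate_logits]
    fused = [0.0] * len(level_features[0])
    for feats, gate in zip(level_features, gates):
        for i, value in enumerate(feats):
            fused[i] += gate * value
    return fused, gates


# Two levels with neutral gates (logit 0 gives a gate of 0.5 each).
fused, gates = gated_fusion([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

Raising the logit of a fine-resolution level would let that level dominate the fused features, which is the kind of size-dependent emphasis the abstract attributes to the gate.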