The main task of automatic brain tumor segmentation in magnetic resonance imaging (MRI) is to automatically segment the brain tumor edema, peritumoral edema, endoscopic core, enhancing tumor core and non-enhancing tumor core from 3D MR images. Because the location, size, shape and intensity of brain tumors vary greatly, segmenting these regions automatically is very difficult. In this paper, by combining the advantages of DenseNet and ResNet, we propose a new 3D U-Net that uses dense blocks in the encoder and residual blocks in the decoder. The number of output feature maps increases with network depth along the contracting path of the encoder, which is consistent with the characteristics of dense blocks. Using dense blocks can decrease the number of network parameters, deepen the network, strengthen feature propagation, alleviate the vanishing-gradient problem and enlarge receptive fields. The residual blocks replace the convolution blocks of the original U-Net in the decoder, which improves network performance. Our approach was trained and validated on the BraTS2019 training and validation data sets. We obtained Dice scores of 0.901, 0.815 and 0.766 for whole tumor, tumor core and enhancing tumor core, respectively, on the BraTS2019 validation data set. Our method performs better than the original 3D U-Net. The experimental results demonstrate that, compared with some state-of-the-art methods, our approach is a competitive automatic brain tumor segmentation method.
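The Dice scores quoted in this abstract measure the overlap between a predicted mask and a reference mask. The following is a minimal sketch of that metric for flat binary masks, not the BraTS evaluation code; the function name `dice_score` and the convention for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice = 2 * |P and T| / (|P| + |T|) for two flat binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 2 overlapping voxels, 3 foreground voxels in each mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(round(dice_score(pred, target), 4))
```

A 3D volume can be scored the same way after flattening it, once per tumor sub-region.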
Deep learning dominates the field of image denoising. However, current network models tend to lose fine-grained information as network depth increases. To address this issue, this paper proposes a Multi-scale Attention Dilated Residual Image Denoising Network (MADRNet) based on skip connections, which consists of a Dense Interval Transmission Block (DTB), a Sparse Residual Block (SRB), a Dilated Residual Attention Reconstruction Block (DRAB) and a Noise Extraction Block (NEB). The DTB enhances the classical dense layer by reducing information redundancy and extracting more accurate feature information. Meanwhile, the SRB improves feature information exchange and model generalization through a sparse mechanism and a skip connection strategy with different expansion factors. The NEB is primarily responsible for extracting and estimating noise. Its output, together with that of the SRB, is fed to the DRAB to prevent the loss of shallow feature information and improve the denoising effect. Furthermore, the DRAB integrates a dilated residual block into an attention mechanism to extract hidden noise information while using residual learning to reconstruct clear images. We examined the performance of MADRNet in grayscale image denoising, color image denoising and real image denoising. The experimental results demonstrate that the proposed network outperforms several strong image denoising networks in terms of peak signal-to-noise ratio, structural similarity index measurement and denoising time. The proposed network effectively addresses the loss of detail information.
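Peak signal-to-noise ratio, the first metric named in this abstract, follows directly from the mean squared error between a reference image and a denoised estimate. A minimal sketch, assuming 8-bit pixel values stored in flat lists; the name `psnr` and the `max_value` default are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE)."""
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by exactly 1 grey level, so the MSE is 1.
reference = [100, 100, 100, 100]
estimate = [101, 99, 101, 99]
print(round(psnr(reference, estimate), 2))
```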
Generative adversarial networks (GANs) have received increasing attention for end-to-end speech enhancement in recent years. Various GAN-based enhancement methods have been presented to improve the quality of reconstructed speech. However, the performance of these GAN-based methods is worse than that of masking-based methods. To tackle this problem, we propose a speech enhancement method with a residual dense generative adversarial network (RDGAN) that maps the log-power spectrum (LPS) of degraded speech to that of clean speech. In detail, a residual dense block (RDB) architecture is designed to better estimate the LPS of clean speech; it extracts rich local features of the LPS through densely connected convolution layers. Meanwhile, sequential RDB connections are incorporated at various scales of the LPS, which significantly increases the flexibility and robustness of feature learning in the time-frequency domain. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in untrained acoustic tests with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, RDGAN still outperforms the existing GAN-based methods and the masking-based method in PESQ and other evaluation indexes. This indicates that our method generalizes better to untrained conditions.
Masking-based and spectrum mapping-based methods are the two main classes of speech enhancement algorithms based on deep neural networks (DNNs). However, mapping-based methods only utilize the phase of the noisy speech, which limits the upper bound of speech enhancement performance, while masking-based methods need to estimate the mask accurately, which remains the key problem. Combining the advantages of these two types of methods, this paper proposes the speech enhancement algorithm MM-RDN (masking-mapping residual dense network), based on masking-mapping (MM) and a residual dense network (RDN). Using the logarithmic power spectrogram (LPS) of consecutive frames, MM estimates the ideal ratio masking (IRM) matrix of consecutive frames. The RDN can make full use of the feature maps of all layers. Meanwhile, by using global residual learning to combine shallow and deep features, the RDN obtains global dense features from the LPS and thereby improves the estimation accuracy of the IRM matrix. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in untrained acoustic tests with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, MM-RDN still outperforms the existing convolutional recurrent network (CRN) method in perceptual evaluation of speech quality (PESQ) and other evaluation indexes. This indicates that the proposed algorithm generalizes better to untrained conditions.
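To make the two quantities in this abstract concrete, the sketch below computes a log-power spectrum and an ideal ratio mask for a single frame. It assumes one common IRM definition, (S / (S + N)) ** beta with beta = 0.5 applied to clean and noise power per frequency bin; the abstract does not state which variant MM-RDN uses, and the names `log_power` and `ideal_ratio_mask` are hypothetical.

```python
import math


def log_power(power, floor=1e-12):
    """Log-power spectrum of one frame; `floor` avoids log(0) for silent bins."""
    return [math.log(max(p, floor)) for p in power]


def ideal_ratio_mask(clean_power, noise_power, beta=0.5):
    """Assumed IRM variant: (S / (S + N)) ** beta per time-frequency bin."""
    mask = []
    for s, n in zip(clean_power, noise_power):
        denom = s + n
        mask.append(0.0 if denom == 0 else (s / denom) ** beta)
    return mask


# Three frequency bins: speech only, noise only, and a mixed bin.
clean = [4.0, 0.0, 1.0]
noise = [0.0, 4.0, 3.0]
print(ideal_ratio_mask(clean, noise))
print(log_power([c + n for c, n in zip(clean, noise)]))
```

In a masking-based pipeline, the network would take the noisy LPS as input and be trained to predict a mask of this kind.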
In this study, an underwater image enhancement method based on a multi-scale adversarial network is proposed to solve the problems of detail blur and color distortion in underwater images. First, the local features of each layer are enhanced into global features by the proposed residual dense block, which ensures that the generated images retain more details. Second, a multi-scale structure is adopted to extract multi-scale semantic features of the original images. Finally, the features obtained from the dual channels are fused by an adaptive fusion module to further optimize the features. The discriminator network adopts the structure of the Markov discriminator. In addition, by constructing mean square error, structural similarity, and perceived color loss functions, the generated image is made consistent with the reference image in structure, color, and content. The experimental results show that the proposed algorithm deblurs underwater images well and effectively reduces their color bias. In both subjective and objective evaluation indexes, the results of the proposed algorithm are better than those of the comparison algorithms.
Deep learning technologies are increasingly used in the field of geophysics, and a variety of algorithms based on shallow convolutional neural networks are widely used in fault recognition, but these methods usually cannot accurately identify complex faults. In this study, exploiting the strong feature-learning ability of deep residual networks, we replace all convolutional layers of the three-dimensional (3D) UNet with residual blocks to build a new 3D Res-UNet, select appropriate parameters through experiments, and train the network on a large amount of synthesized seismic data. After training is completed, we introduce knowledge distillation. First, we treat the trained 3D Res-UNet as a teacher network and then train a 3D Res-UNet as a student network; in this process, the teacher network is in evaluation mode. Finally, we calculate a mixed loss function that combines the teacher model and the student network so that the student learns more fault information, which improves the performance of the network and optimizes the fault recognition effect. The quantitative evaluation on the synthetic model test shows that knowledge distillation considerably improves the fault recognition accuracy of the 3D Res-UNet from 0.956 to 0.993, and the application to actual seismic data verifies the effectiveness and feasibility of our method.
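The abstract does not give the form of the mixed loss. One common formulation of knowledge distillation, assumed here purely for illustration, weights a hard term (student against ground-truth labels) and a soft term (student against the frozen teacher's probabilities). The names `bce` and `mixed_distillation_loss` and the weight `alpha` are hypothetical.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy; `target` may hold hard labels or soft probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def mixed_distillation_loss(student, teacher, labels, alpha=0.5):
    """Assumed mixed loss: alpha * hard term + (1 - alpha) * soft term."""
    hard = bce(student, labels)   # student vs. fault / non-fault labels
    soft = bce(student, teacher)  # student vs. teacher output (teacher is not updated)
    return alpha * hard + (1.0 - alpha) * soft


# Per-voxel fault probabilities for four voxels.
student = [0.7, 0.2, 0.6, 0.1]
teacher = [0.9, 0.1, 0.8, 0.05]
labels = [1.0, 0.0, 1.0, 0.0]
print(round(mixed_distillation_loss(student, teacher, labels, alpha=0.5), 4))
```

Setting `alpha=1.0` recovers ordinary supervised training, which makes the contribution of the teacher term easy to isolate in an ablation.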
Building extraction from high-resolution remote sensing images is a key technology of digital city construction [14]. To solve the problems of low efficiency and low precision in traditional remote sensing image segmentation, this paper adopts an improved U-Net network structure. First, to extract effective building feature information, an FPN structure is introduced to improve the U-Net model's ability to integrate multi-scale information. Second, to solve the problem that feature information weakens as the network deepens, an efficient residual block network is introduced. Finally, to better distinguish the target area from the background area in the image and to improve the precision of building edge detection, the cross-entropy loss and the Dice loss are combined in a weighted linear sum. Experimental results show that the algorithm improves the image segmentation effect and improves accuracy by 18%.
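The weighted linear combination of cross-entropy and Dice loss described in this abstract can be sketched as follows for a binary building/background mask. The weight value, the smoothing term and the function names are assumptions; the abstract does not report the weights the authors used.

```python
import math


def cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over pixels."""
    total = 0.0
    for p, t in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * sum(p * t) + smooth) / (sum(p) + sum(t) + smooth)."""
    inter = sum(p * t for p, t in zip(probs, labels))
    return 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(labels) + smooth)


def combined_loss(probs, labels, weight=0.5):
    """Assumed form: weight * CE + (1 - weight) * Dice."""
    return weight * cross_entropy(probs, labels) + (1.0 - weight) * dice_loss(probs, labels)


labels = [1, 0, 1, 0]
print(round(combined_loss([0.9, 0.1, 0.8, 0.2], labels), 4))  # good prediction
print(round(combined_loss([0.4, 0.6, 0.3, 0.7], labels), 4))  # poor prediction
```

The cross-entropy term scores each pixel independently, while the Dice term scores region overlap, which is one reason the two are often combined when the foreground class is small.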
In recent years, the problem of the “Impolite Pedestrian” at zebra crossings has aroused widespread concern from all walks of life, and the traffic sector's governance measures have become stricter. The traditional way of governance is on-site law enforcement, which requires a lot of manpower and material resources and is inefficient. An enhanced YOLOv3-tiny model is proposed for pedestrian and vehicle detection in traffic monitoring. By modifying the backbone network structure of the YOLOv3-tiny model, introducing depthwise separable convolution, and designing the basic residual block unit of the network, the feature extraction ability of the backbone network is enhanced. The improved model is trained on the VOC2007+VOC2012 training set, and the trained model is evaluated on the test data set. The experimental results show that the mean Average Precision (mAP) increased from 0.672 to 0.732, increasing the measurement accuracy by about 9% in relative terms. The Intersection over Union (IoU) increased from 0.783 to 0.855, increasing the coverage accuracy by 7.2 percentage points. The enhanced YOLOv3-tiny model has higher measurement accuracy than the original model. Applied to 1080P traffic video on an NVIDIA RTX 2080, the model reaches a detection speed of 150 FPS, which fully achieves real-time detection. By analyzing pedestrian and vehicle coordinates, the system judges whether a violation has occurred. For offending vehicles, three pictures are saved as the basis for law enforcement, which forms an important supplement to off-site law enforcement.
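Two pieces of arithmetic sit behind the figures in this abstract: the Intersection over Union of two boxes, and the difference between an absolute and a relative gain. The sketch below assumes axis-aligned boxes given as `(x1, y1, x2, y2)`; the helper names are illustrative. Applied to the reported numbers, the 9% mAP figure matches the relative gain, while the 7.2% IoU figure matches the absolute gain.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def gains(before, after):
    """Return (absolute gain, relative gain) of a metric."""
    return after - before, (after - before) / before


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
print(round(box_iou((0, 0, 2, 2), (1, 1, 3, 3)), 4))

# Figures reported in the abstract.
print([round(g, 4) for g in gains(0.672, 0.732)])  # mAP: absolute, relative
print([round(g, 4) for g in gains(0.783, 0.855)])  # IoU: absolute, relative
```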
Accurate pancreas segmentation is critical for the diagnosis and management of diseases of the pancreas. Precisely delineating the pancreas is challenging because of the high variation in its volume, shape and location. In recent years, coarse-to-fine methods have been widely used to alleviate the class imbalance issue and improve pancreas segmentation accuracy. However, cascaded methods can be computationally intensive, and the refined results depend significantly on the performance of the coarse segmentation. To balance segmentation accuracy and computational efficiency, we propose a Discriminative Feature Attention Network for pancreas segmentation that effectively highlights pancreas features and improves segmentation accuracy without explicit pancreas localization. The final segmentation is obtained by applying a simple yet effective post-processing step. Experiments on the public NIH pancreas CT dataset and the abdominal BTCV multi-organ dataset are conducted separately to show the effectiveness of our method for 2D pancreas segmentation. We obtained an average Dice Similarity Coefficient (DSC) of 82.82 ± 6.09%, an average Jaccard Index (JI) of 71.13 ± 8.30% and an average Symmetric Average Surface Distance (ASD) of 1.69 ± 0.83 mm on the NIH dataset. Compared with existing deep learning-based pancreas segmentation methods, our results achieve the best average DSC and JI values.
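For a single case, the Dice coefficient and the Jaccard index are tied by JI = DSC / (2 - DSC). The sketch below applies that identity to the reported averages as a consistency check; the function names are illustrative. Because the mapping is convex, the mean of per-case JI values is at least the JI converted from the mean DSC, which agrees with the reported 71.13% lying slightly above the converted value of about 70.68%.

```python
def dice_to_jaccard(dsc):
    """Per-case identity: JI = DSC / (2 - DSC)."""
    return dsc / (2.0 - dsc)


def jaccard_to_dice(ji):
    """Inverse identity: DSC = 2 * JI / (1 + JI)."""
    return 2.0 * ji / (1.0 + ji)


# Convert the reported mean DSC (82.82%) to a Jaccard index.
print(round(100.0 * dice_to_jaccard(0.8282), 2))

# Convexity check on two hypothetical cases with the same mean DSC.
cases = [0.76, 0.8964]
mean_of_ji = sum(dice_to_jaccard(d) for d in cases) / len(cases)
ji_of_mean = dice_to_jaccard(sum(cases) / len(cases))
print(mean_of_ji >= ji_of_mean)
```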
Image translation plays a significant role in realistic image synthesis, entertainment tasks such as editing and colorization, and security applications including personal identification. In Edge GAN, the major contribution is the attribute-guided vector that enables the generation of content with high visual quality. This study proposes automatic realistic face image generation from freehand sketches based on Edge GAN. We propose a density-variant image synthesis model that allows the input sketch to encompass face features with minute details. The density level is projected into non-latent space, with a linearly controlled function parameter. This helps the user to appropriately devise the variant densities of facial sketches and image synthesis. A composite data set of Large Scale CelebFaces Attributes (CelebA), Labelled Faces in the Wild (LFWH), Chinese University of Hong Kong (CUHK), and self-generated Asian images is used to evaluate the proposed approach. The solution is validated, through quantitative and qualitative results and human evaluation, to be capable of generating realistic face images.
Many networks stack a large number of residual blocks to deepen the network and improve performance through short residual connections, long residual connections, and dense connections. However, because they do not consider the different contributions that features at different depths make to the network, these designs cannot evaluate the importance of features at different depths. To solve this problem, this paper proposes an adaptive densely residual network (ADRNet) for single image super-resolution. ADRNet evaluates features at different depths and learns more representative features. An adaptive densely residual block (ADRB) is designed by combining three residual blocks (RBs) with dense connections. It learns an attention score for each dense connection through adaptive dense connections, and the attention score reflects the importance of the features of each RB. To further enhance the performance of the ADRB, a multi-direction attention block (MDAB) is introduced to obtain multidirectional context information. Comparative experiments show that the proposed ADRNet is superior to existing methods, and ablation experiments show that evaluating features at different depths helps to improve network performance.
Recently, with the urgent demand for data-driven approaches in practical industrial scenarios, deep learning diagnosis models for noisy environments have attracted increasing attention. However, existing research has two limitations: (1) environmental noise is complex and changeable, so high-performance diagnosis cannot be ensured across different noise domains, and (2) multiple faults may occur simultaneously, which challenges model diagnosis. This paper presents a novel anti-noise multi-scale convolutional neural network (AM-CNN) for compound fault diagnosis under noise of different intensities. First, we propose a residual pre-processing block based on the principle of noise superposition to process the input information, and we introduce a residual loss to construct a new loss function. Additionally, considering the strong coupling of the input information, we design a multi-scale convolution block to realize multi-scale feature extraction, which enhances the robustness and effectiveness of the proposed model. Finally, a multi-label classifier is used to distinguish multiple bearing faults simultaneously. The proposed AM-CNN is verified on our collected compound fault dataset. On average, compared with existing methods, AM-CNN improves accuracy by 39.93% and F1-macro by 25.84% under the no-noise working condition, and accuracy by 45.67% and F1-macro by 27.72% under working conditions with noise of different intensities. Furthermore, the experimental results show that AM-CNN achieves good cross-domain performance with 100% accuracy and 100% F1-macro. Thus, AM-CNN has the potential to be an accurate and stable fault diagnosis tool.
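In a multi-label setting such as the compound fault diagnosis described here, F1-macro averages the per-label F1 scores with equal weight. A minimal sketch follows; the three fault labels in the comments are hypothetical, and returning 0 for a label with no true or predicted positives is an assumed convention.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-label F1 over multi-label 0/1 indicator rows."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: F1 is 0 when a label never occurs and is never predicted.
        scores.append(2.0 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Columns (hypothetical): inner-race fault, outer-race fault, ball fault.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
print(round(f1_macro(y_true, y_pred), 4))
```

Each row may carry several 1s at once, which is what lets a single sample be assigned more than one simultaneous fault.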
To address the lack of high-frequency information and texture details and the unstable training of super-resolution generative adversarial networks, this paper optimizes the generator and discriminator based on the SRGAN model. First, the residual dense block is used as the basic structural unit of the generator to improve the network's feature extraction capability. Second, enhanced lightweight coordinate attention is incorporated to help the network concentrate more precisely on high-frequency location information, thereby allowing the generator to produce more realistic image reconstruction results. Then, we propose a symmetric and efficient pyramidal segmentation attention discriminator network in which the attention mechanism is capable of deriving finer-grained multiscale spatial information and creating long-term dependencies between multiscale channel attentions, thus enhancing the discriminative ability of the network. Finally, a Charbonnier loss function and a gradient variance loss function with improved robustness are used to better recover the image's texture structure and enhance the model's stability. The experiments show that, compared with SRGAN on the three test sets, the reconstructed images improve the average peak signal-to-noise ratio (PSNR) by 1.59 dB and the structural similarity index (SSIM) by 0.045. Compared with the state-of-the-art methods, the reconstructed images have a clearer texture structure, richer high-frequency details, and better visual effects.
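The Charbonnier loss mentioned in this abstract is a smooth variant of the L1 loss, sqrt(d^2 + eps^2) averaged over pixels. The sketch below compares it with the mean squared error on a residual containing one outlier; the value of `eps` and the function names are assumptions for illustration.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean of sqrt((pred - target)^2 + eps^2) over pixels."""
    return sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target)) / len(pred)


def mse_loss(pred, target):
    """Mean squared error, shown only for comparison."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


target = [0.0, 0.0, 0.0, 0.0]
pred = [0.1, -0.1, 0.1, 10.0]  # the last pixel is an outlier
print(round(charbonnier_loss(pred, target), 4))
print(round(mse_loss(pred, target), 4))
```

The outlier contributes linearly to the Charbonnier value but quadratically to the MSE, which is the usual sense in which the former is called more robust.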
Super-resolution (SR) methods based on generative adversarial networks (GANs) cannot adequately capture the diversity of the training data, resulting in misalignment between input low-resolution (LR) images and output high-resolution (HR) images, and GAN training has difficulty converging. To address these problems, an advanced GAN-based image SR reconstruction method is presented. First, the dense connection residual block and an attention mechanism are integrated into the GAN generator to improve high-frequency feature extraction. Meanwhile, an additional discriminator is added to the GAN discriminator network, forming a dual discriminator that keeps the training process stable. Second, the more robust Charbonnier loss is used instead of the mean square error (MSE) loss to compare the similarity between the obtained image and the actual image, and the total variation (TV) loss is employed to smooth the training results. Finally, the experimental results indicate that, compared with other SOTA methods, the proposed method better reconstructs the global structures and texture details of images. The peak signal-to-noise ratio (PSNR) values obtained with the proposed method are improved by an average of 2.24 dB, and the structural similarity index measure (SSIM) values are improved by an average of 0.07.
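The total variation loss used here as a smoothness term sums the differences between neighboring pixels. A minimal sketch of the anisotropic (absolute-difference) form on a 2D grayscale image stored as nested lists; the abstract does not say which TV variant the authors use, so this form and the name `total_variation` are assumptions.

```python
def total_variation(image):
    """Anisotropic TV: sum of |horizontal| and |vertical| neighbor differences."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:  # right-hand neighbor
                tv += abs(image[i][j + 1] - image[i][j])
            if i + 1 < rows:  # neighbor below
                tv += abs(image[i + 1][j] - image[i][j])
    return tv


smooth = [[1.0, 1.0], [1.0, 1.0]]
textured = [[1.0, 2.0], [3.0, 5.0]]
print(total_variation(smooth), total_variation(textured))
```

Adding a small multiple of this term to the training loss penalizes pixel-level noise in the generated image while leaving constant regions unaffected.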
Funding (3D U-Net with dense encoder and residual decoder blocks for brain tumor segmentation): This work was supported in part by the Sichuan Science and Technology Program under Grants 2019YJ0356, 21ZDYF2484 and 21GJHZ0061, and by the Scientific Research Foundation of the Education Department of Sichuan Province under Grant 18ZB0117.
Funding (MADRNet image denoising): This work was funded by the National Natural Science Foundation of China under grant number 61302188.
Funding (RDGAN speech enhancement): This work was supported by the National Key Research and Development Program of China under Grant 2020YFC2004003 and Grant 2020YFC2004002, and by the National Natural Science Foundation of China (NSFC) under Grant No. 61571106.
Funding (MM-RDN speech enhancement): This work was supported by the National Key Research and Development Program of China under Grant 2020YFC2004003 and Grant 2020YFC2004002, and by the National Natural Science Foundation of China (NSFC) under Grant No. 61571106.
Funding (3D Res-UNet fault recognition with knowledge distillation): This work was supported by the National Natural Science Foundation of China (No. 42072169).
Abstract: Building extraction from high-resolution remote sensing images is a key technology of digital city construction [14]. To address the low efficiency and low precision of traditional remote sensing image segmentation, this paper adopts an improved U-Net network structure. First, to extract effective building feature information, an FPN structure is introduced to improve the U-Net model's ability to integrate multi-scale information. Second, to counter the weakening of feature information as the network deepens, an efficient residual block network is introduced. Finally, to better distinguish the target area from the background area in the image and improve the precision of building edge detection, the cross-entropy loss and the Dice loss are combined in a weighted linear sum. Experimental results show that the algorithm improves the image segmentation effect and improves the image accuracy by 18%.
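A weighted linear combination of cross-entropy and Dice loss, as described in this abstract, might look like the following sketch. The weights `w_ce` and `w_dice`, the smoothing constant, and the function names are assumptions; the abstract does not report the values used.

```python
import math

def cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)

def dice_loss(pred, target, smooth=1.0):
    """Soft Dice loss: 1 - (2|P.T| + s) / (|P| + |T| + s)."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

def combined_loss(pred, target, w_ce=0.5, w_dice=0.5):
    """Weighted linear sum of the two losses (weights are hypothetical)."""
    return w_ce * cross_entropy(pred, target) + w_dice * dice_loss(pred, target)

mask = [1.0, 0.0, 1.0, 0.0]           # building = 1, background = 0
good = combined_loss([0.9, 0.1, 0.8, 0.2], mask)
bad = combined_loss([0.4, 0.6, 0.3, 0.7], mask)
```

The Dice term is driven by region overlap rather than per-pixel counts, which is why it is often paired with cross-entropy when the target class covers a small part of the image.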
Funding: Supported by the National Key R&D Program of China (2018YFF01010100), the National Natural Science Foundation of China (61672064), the Beijing Natural Science Foundation project (4172001), and the Advanced Information Network Beijing Laboratory (PXM2019_014204_500029).
Abstract: In recent years, the problem of the “Impolite Pedestrian” in front of zebra crossings has aroused widespread concern from all walks of life, and the traffic sector’s governance measures have become stricter. The traditional way of governance is on-site law enforcement, which requires a lot of manpower and material resources and is inefficient. An enhanced YOLOv3-tiny model is proposed for pedestrian and vehicle detection in traffic monitoring. By modifying the backbone network structure of the YOLOv3-tiny model, introducing depthwise separable convolution, and designing the basic residual block unit of the network, the feature extraction ability of the backbone network is enhanced. The improved model is trained on the VOC2007+VOC2012 training set, and the trained model is evaluated on the test data set. The experimental results show that the mean Average Precision (mAP) increased from 0.672 to 0.732, increasing the measurement accuracy by 9%, and the Intersection over Union (IoU) increased from 0.783 to 0.855, increasing the coverage accuracy by 7.2%. The enhanced YOLOv3-tiny model has higher measurement accuracy than the original model. Applied to 1080P traffic video on an NVIDIA RTX 2080, the model reaches a detection speed of 150 FPS, which fully achieves real-time detection. By analyzing pedestrian and vehicle coordinates, the system judges whether an illegal act has occurred. For illegal vehicles, three pictures are saved as the basis for law enforcement, which forms an important supplement to off-site law enforcement.
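The Intersection over Union reported in this abstract measures how well a predicted box covers a ground-truth box. A minimal sketch, assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` (the box format and function name are illustrative, not taken from the paper):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlap rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clipped at zero when the boxes do not overlap.
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap 1, union 7
```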
Funding: Supported by the Ph.D. Research Startup Project of Minnan Normal University (KJ2021020), the National Natural Science Foundation of China (12090020 and 12090025), and the Zhejiang Provincial Natural Science Foundation of China (LSD19H180005).
Abstract: Accurate pancreas segmentation is critical for the diagnosis and management of diseases of the pancreas. It is challenging to delineate the pancreas precisely because of its high variability in volume, shape, and location. In recent years, coarse-to-fine methods have been widely used to alleviate the class imbalance issue and improve pancreas segmentation accuracy. However, cascaded methods can be computationally intensive, and the refined results depend strongly on the performance of the coarse segmentation. To balance segmentation accuracy and computational efficiency, we propose a Discriminative Feature Attention Network for pancreas segmentation, which effectively highlights pancreas features and improves segmentation accuracy without explicit pancreas localization. The final segmentation is obtained by applying a simple yet effective post-processing step. Experiments on the public NIH pancreas CT dataset and the abdominal BTCV multi-organ dataset are conducted separately to show the effectiveness of our method for 2D pancreas segmentation. We obtained an average Dice Similarity Coefficient (DSC) of 82.82 ± 6.09%, an average Jaccard Index (JI) of 71.13 ± 8.30%, and an average Symmetric Average Surface Distance (ASD) of 1.69 ± 0.83 mm on the NIH dataset. Compared with existing deep learning-based pancreas segmentation methods, our experimental results achieve the best average DSC and JI values.
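The two overlap metrics reported in this abstract can be computed from binary masks as in the sketch below. The flattened-list mask representation and the function name are assumptions made for illustration.

```python
def dice_and_jaccard(pred, truth):
    """Return (DSC, JI) for two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    p_sum = sum(1 for p in pred if p)
    t_sum = sum(1 for t in truth if t)
    union = p_sum + t_sum - inter
    # Two empty masks are treated as a perfect match.
    dsc = 2.0 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    ji = inter / union if union else 1.0
    return dsc, ji

dsc, ji = dice_and_jaccard([1, 1, 0, 0], [1, 0, 1, 0])  # DSC = 0.5, JI = 1/3
```

For a single case the two metrics are tied by JI = DSC / (2 - DSC). That identity does not carry over to averages across cases, so a mean JI need not equal the value obtained by plugging the mean DSC into the formula.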
Funding: The authors received no specific funding for this study.
Abstract: Image translation plays a significant role in realistic image synthesis, in entertainment tasks such as editing and colorization, and in security applications including personal identification. In Edge GAN, the major contribution is the attribute-guided vector, which enables the generation of content with high visual quality. This study proposes the automatic generation of realistic face images from freehand sketches based on Edge GAN. We propose a density-variant image synthesis model that allows the input sketch to encompass face features with minute details. The density level is projected into non-latent space with a linearly controlled function parameter. This helps the user to devise the variant densities of facial sketches and image synthesis appropriately. A composite data set of Large Scale CelebFaces Attributes (CelebA), Labelled Faces in the Wild (LFWH), Chinese University of Hong Kong (CUHK), and self-generated Asian images is used to evaluate the proposed approach. Quantitative and qualitative results and human evaluation validate that the solution can generate realistic face images.
Abstract: Many networks stack a large number of residual blocks to deepen the network and improve its performance through short residual connections, long residual connections, and dense connections. However, because they do not consider the different contributions of features at different depths, these designs do not evaluate the importance of those features. To solve this problem, this paper proposes an adaptive densely residual network (ADRNet) for single image super-resolution. ADRNet evaluates the distributions of features at different depths and learns more representative features. An adaptive densely residual block (ADRB) is designed by combining three residual blocks (RBs) and adding dense connections. It learns an attention score for each dense connection through adaptive dense connections, and the attention score reflects the importance of the features of each RB. To further enhance the performance of the ADRB, a multi-direction attention block (MDAB) is introduced to obtain multidirectional context information. Comparative experiments show that the proposed ADRNet is superior to existing methods, and ablation experiments show that evaluating features at different depths helps to improve network performance.
Funding: Supported by the National Key R&D Program of China (Grant No. 2020YFB1709604), the State Key Laboratory of Mechanical System and Vibration (Grant No. MSVZD202103), and the Shanghai Municipal Science and Technology Major Project (Grant No. 2021SHZDZX0102).
Abstract: Recently, with the urgent demand for data-driven approaches in practical industrial scenarios, deep learning diagnosis models for noisy environments have attracted increasing attention. However, existing research has two limitations: (1) environmental noise is complex and changeable, so high-performance diagnosis cannot be ensured across different noise domains, and (2) multiple faults may occur simultaneously, which challenges model diagnosis. This paper presents a novel anti-noise multi-scale convolutional neural network (AM-CNN) for compound fault diagnosis under noise of different intensities. First, we propose a residual pre-processing block based on the principle of noise superposition to process the input information, and we introduce a residual loss to construct a new loss function. Additionally, considering the strong coupling of the input information, we design a multi-scale convolution block that performs multi-scale feature extraction to enhance the robustness and effectiveness of the proposed model. Finally, a multi-label classifier is used to distinguish multiple bearing faults simultaneously. The proposed AM-CNN is verified on our collected compound fault dataset. On average, compared with existing methods, AM-CNN improves accuracy by 39.93% and F1-macro by 25.84% under the no-noise working condition, and improves accuracy by 45.67% and F1-macro by 27.72% under working conditions with noise of different intensities. Furthermore, the experimental results show that AM-CNN can achieve good cross-domain performance with 100% accuracy and 100% F1-macro. Thus, AM-CNN has the potential to be an accurate and stable fault diagnosis tool.
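Multi-label classification and the F1-macro metric used in this abstract can be illustrated with the sketch below. The 0.5 decision threshold, the two-label example data, and the function names are assumptions; the abstract does not describe the classifier's output layer.

```python
def predict_labels(probs, threshold=0.5):
    """Turn per-label probabilities into 0/1 decisions, one vector per sample."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]

def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores."""
    n_labels = len(y_true[0])
    scores = []
    for k in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 0 and p[k] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2.0 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels

# Each sample may carry several fault labels at once (compound faults).
y_true = [[1, 0], [1, 1], [0, 1]]
y_pred = predict_labels([[0.9, 0.2], [0.3, 0.8], [0.1, 0.7]])
score = f1_macro(y_true, y_pred)  # label 0: F1 = 2/3, label 1: F1 = 1
```

Because every label contributes equally to the mean, F1-macro penalizes a model that misses rare fault types even when overall accuracy is high.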
Funding: This work was supported in part by the Basic Scientific Research Project of Liaoning Provincial Department of Education under Grant Nos. LJKQZ2021152 and LJ2020JCL007, in part by the National Science Foundation of China (NSFC) under Grant No. 61602226, and in part by the PhD Startup Foundation of Liaoning Technical University of China under Grant No. 18-1021.
Abstract: To address the lack of high-frequency information and texture details and the unstable training of superresolution generative adversarial networks, this paper optimizes the generator and discriminator based on the SRGAN model. First, the residual dense block is used as the basic structural unit of the generator to improve the network’s feature extraction capability. Second, enhanced lightweight coordinate attention is incorporated to help the network concentrate more precisely on high-frequency location information, thereby allowing the generator to produce more realistic image reconstruction results. Then, we propose a symmetric and efficient pyramidal segmentation attention discriminator network in which the attention mechanism can derive finer-grained multiscale spatial information and create long-term dependencies between multiscale channel attentions, thus enhancing the discriminative ability of the network. Finally, a Charbonnier loss function and a gradient variance loss function with improved robustness are used to better recover the image’s texture structure and enhance the model’s stability. The experimental findings reveal that, compared with SRGAN on the three test sets, the method improves the average peak signal-to-noise ratio (PSNR) by 1.59 dB and the structural similarity index (SSIM) by 0.045. Compared with state-of-the-art methods, the reconstructed images have a clearer texture structure, richer high-frequency details, and better visual effects.
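The Charbonnier loss and the PSNR metric named in this abstract have standard definitions, sketched below on flattened pixel lists. The value of `eps`, the peak value of 1.0, and the function names are assumptions, since the abstract does not state the settings used.

```python
import math

def charbonnier_loss(pred, target, eps=1e-3):
    """Mean of sqrt((x - y)^2 + eps^2), a smooth approximation of the L1 loss."""
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / len(pred)

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

loss = charbonnier_loss([0.0, 0.0], [3.0, 4.0], eps=0.0)  # reduces to mean |x - y|
quality = psnr([0.0, 0.0], [0.1, 0.1])                    # MSE = 0.01
```

Unlike the squared error, the Charbonnier penalty grows roughly linearly for large residuals, which makes it less sensitive to outliers.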
Funding: Supported in part by the Basic Scientific Research Project of Liaoning Provincial Department of Education under Grant No. LJKQZ2021152, in part by the National Science Foundation of China (NSFC) under Grant No. 61602226, and in part by the PhD Startup Foundation of Liaoning Technical University of China under Grant No. 18-1021.
Abstract: Superresolution (SR) methods based on generative adversarial networks (GANs) cannot capture enough diversity from training data, resulting in misalignment between input low-resolution (LR) images and output high-resolution (HR) images, and GAN training has difficulty converging. To address this, an advanced GAN-based image SR reconstruction method is presented. First, the dense connection residual block and an attention mechanism are integrated into the GAN generator to improve high-frequency feature extraction. Meanwhile, an additional discriminator is added to the GAN discriminant network, forming a dual discriminator that keeps the training process stable. Second, the more robust Charbonnier loss is used instead of the mean square error (MSE) loss to compare the similarity between the obtained image and the actual image, and the total variation (TV) loss is employed to smooth the training results. Finally, the experimental results indicate that the proposed method reconstructs global structures and texture details of images better than other SOTA methods. The peak signal-to-noise ratio (PSNR) values of the proposed method are improved by an average of 2.24 dB, and the structural similarity index measure (SSIM) values are improved by an average of 0.07.
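The total variation loss mentioned in this abstract penalizes differences between neighboring pixels, which suppresses high-frequency artifacts. The sketch below uses the anisotropic (absolute-difference) form on a single-channel image stored as a list of rows; that variant and the function name are assumptions, as the abstract does not say which form was used.

```python
def total_variation(img):
    """Anisotropic TV: sum of |vertical| + |horizontal| neighbor differences."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:
                tv += abs(img[i + 1][j] - img[i][j])  # vertical neighbor
            if j + 1 < w:
                tv += abs(img[i][j + 1] - img[i][j])  # horizontal neighbor
    return tv

flat = total_variation([[0.5, 0.5], [0.5, 0.5]])  # constant image
edge = total_variation([[0.0, 1.0], [0.0, 1.0]])  # one vertical edge
```

A constant image has zero total variation, while every edge adds to the penalty, so the term is normally given a small weight to avoid blurring real detail.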