Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware reso...Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.展开更多
There is instability in the distributed energy storage cloud group end region on the power grid side.In order to avoid large-scale fluctuating charging and discharging in the power grid environment and make the capaci...There is instability in the distributed energy storage cloud group end region on the power grid side.In order to avoid large-scale fluctuating charging and discharging in the power grid environment and make the capacitor components showa continuous and stable charging and discharging state,a hierarchical time-sharing configuration algorithm of distributed energy storage cloud group end region on the power grid side based on multi-scale and multi feature convolution neural network is proposed.Firstly,a voltage stability analysis model based onmulti-scale and multi feature convolution neural network is constructed,and the multi-scale and multi feature convolution neural network is optimized based on Self-OrganizingMaps(SOM)algorithm to analyze the voltage stability of the cloud group end region of distributed energy storage on the grid side under the framework of credibility.According to the optimal scheduling objectives and network size,the distributed robust optimal configuration control model is solved under the framework of coordinated optimal scheduling at multiple time scales;Finally,the time series characteristics of regional power grid load and distributed generation are analyzed.According to the regional hierarchical time-sharing configuration model of“cloud”,“group”and“end”layer,the grid side distributed energy storage cloud group end regional hierarchical time-sharing configuration algorithm is realized.The experimental results show that after applying this algorithm,the best grid side distributed energy storage configuration scheme can be determined,and the stability of grid side distributed energy storage cloud group end region layered timesharing configuration can be improved.展开更多
Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale regi...Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale region of interest(ROI).However,some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space.Also,existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information,which is crucial in pulse extraction problems,resulting in insufficient spatiotemporal feature modelling.Methods Here,we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution(SSTC)and dimension separable attention(DSAT).First,to solve the problem of a single-scale ROI,we constructed a multi-scale feature space for initial signal separation.Second,SSTC and DSAT were designed for efficient spatiotemporal correlation modeling,which increased the information interaction between the long-span time and space dimensions;this placed more emphasis on temporal features.Results The signal-to-noise ratio(SNR)of the proposed network reached 9.58dB on the PURE dataset and 6.77dB on the UBFC-rPPG dataset,outperforming state-of-the-art algorithms.Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals.The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.展开更多
As an integrated application of modern information technologies and artificial intelligence,Prognostic and Health Management(PHM)is important for machine health monitoring.Prediction of tool wear is one of the symboli...As an integrated application of modern information technologies and artificial intelligence,Prognostic and Health Management(PHM)is important for machine health monitoring.Prediction of tool wear is one of the symbolic applications of PHM technology in modern manufacturing systems and industry.In this paper,a multi-scale Convolutional Gated Recurrent Unit network(MCGRU)is proposed to address raw sensory data for tool wear prediction.At the bottom of MCGRU,six parallel and independent branches with different kernel sizes are designed to form a multi-scale convolutional neural network,which augments the adaptability to features of different time scales.These features of different scales extracted from raw data are then fed into a Deep Gated Recurrent Unit network to capture long-term dependencies and learn significant representations.At the top of the MCGRU,a fully connected layer and a regression layer are built for cutting tool wear prediction.Two case studies are performed to verify the capability and effectiveness of the proposed MCGRU network and results show that MCGRU outperforms several state-of-the-art baseline models.展开更多
Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances,varied poses and different camera views. A multiscale and multi-label c...Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances,varied poses and different camera views. A multiscale and multi-label convolutional neural network( MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The pedestrian attribute classification problem is firstly transformed into a multi-label problem including multiple binary attributes needed to be classified. Then,the multi-label problem is solved by fully connecting all binary attributes to multi-scale features with logistic regression functions. Moreover,the multi-scale features are obtained by concatenating those featured maps produced from multiple pooling layers of the MSMLCNN at different scales. Extensive experiment results show that the proposed MSMLCNN outperforms state-of-the-art pedestrian attribute classification methods with a large margin.展开更多
Due to the lack of long-range association and spatial location information,fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods.Th...Due to the lack of long-range association and spatial location information,fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods.This paper presents a convolutional structure with multi-scale fusion to optimize the step of clothing feature extraction and a self-attention module to capture long-range association information.The structure enables the self-attention mechanism to directly participate in the process of information exchange through the down-scaling projection operation of the multi-scale framework.In addition,the improved self-attention module introduces the extraction of 2-dimensional relative position information to make up for its lack of ability to extract spatial position features from clothing images.The experimental results based on the colorful fashion parsing dataset(CFPD)show that the proposed network structure achieves 53.68%mean intersection over union(mIoU)and has better performance on the clothing parsing task.展开更多
The tradeoff between efficiency and model size of the convolutional neural network(CNN)is an essential issue for applications of CNN-based algorithms to diverse real-world tasks.Although deep learning-based methods ha...The tradeoff between efficiency and model size of the convolutional neural network(CNN)is an essential issue for applications of CNN-based algorithms to diverse real-world tasks.Although deep learning-based methods have achieved significant improvements in image super-resolution(SR),current CNNbased techniques mainly contain massive parameters and a high computational complexity,limiting their practical applications.In this paper,we present a fast and lightweight framework,named weighted multi-scale residual network(WMRN),for a better tradeoff between SR performance and computational efficiency.With the modified residual structure,depthwise separable convolutions(DS Convs)are employed to improve convolutional operations’efficiency.Furthermore,several weighted multi-scale residual blocks(WMRBs)are stacked to enhance the multi-scale representation capability.In the reconstruction subnetwork,a group of Conv layers are introduced to filter feature maps to reconstruct the final high-quality image.Extensive experiments were conducted to evaluate the proposed model,and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.展开更多
The background pattern of patterned fabrics is complex,which has a great interference in the extraction of defect features.Traditional machine vision algorithms rely on artificially designed features,which are greatly...The background pattern of patterned fabrics is complex,which has a great interference in the extraction of defect features.Traditional machine vision algorithms rely on artificially designed features,which are greatly affected by background patterns and are difficult to effectively extract flaw features.Therefore,a convolutional neural network(CNN)with automatic feature extraction is proposed.On the basis of the two-stage detection model Faster R-CNN,Resnet-50 is used as the backbone network,and the problem of flaws with extreme aspect ratio is solved by improving the initialization algorithm of the prior frame aspect ratio,and the improved multi-scale model is designed to improve detection of small defects.The cascade R-CNN is introduced to improve the accuracy of defect detection,and the online hard example mining(OHEM)algorithm is used to strengthen the learning of hard samples to reduce the interference of complex backgrounds on the defect detection of patterned fabrics,and construct the focal loss as a loss function to reduce the impact of sample imbalance.In order to verify the effectiveness of the improved algorithm,a defect detection comparison experiment was set up.The experimental results show that the accuracy of the defect detection algorithm of patterned fabrics in this paper can reach 95.7%,and it can accurately locate the defect location and meet the actual needs of the factory.展开更多
Cardiomyopathy is one of the most serious public health threats.The precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning.Cardiologists are ...Cardiomyopathy is one of the most serious public health threats.The precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning.Cardiologists are often required to draw endocardial and epicardial contours of the left ventricle(LV)manually in routine clinical diagnosis or treatment planning period.This task is time-consuming and error-prone.Therefore,it is necessary to develop a fully automated end-to-end semantic segmentation method on cardiac magnetic resonance(CMR)imaging datasets.However,due to the low image quality and the deformation caused by heartbeat,there is no effective tool for fully automated end-to-end cardiac segmentation task.In this work,we propose a multi-scale segmentation network(MSSN)for left ventricle segmentation.It can effectively learn myocardium and blood pool structure representations from 2D short-axis CMR image slices in a multi-scale way.Specifically,our method employs both parallel and serial of dilated convolution layers with different dilation rates to capture multi-scale semantic features.Moreover,we design graduated up-sampling layers with subpixel layers as the decoder to reconstruct lost spatial information and produce accurate segmentation masks.We validated our method using 164 T1 Mapping CMR images and showed that it outperforms the advanced convolutional neural network(CNN)models.In validation metrics,we archived the Dice Similarity Coefficient(DSC)metric of 78.96%.展开更多
Various diseases seriously affect the quality and yield of tomatoes. Fast and accurate identification of disease types is of great significance for the development of smart agriculture. Many Convolution Neural Network...Various diseases seriously affect the quality and yield of tomatoes. Fast and accurate identification of disease types is of great significance for the development of smart agriculture. Many Convolution Neural Network (CNN) models have been applied to the identification of tomato leaf diseases and achieved good results. However, some of these are executed at the cost of large calculation time and huge storage space. This study proposed a lightweight CNN model named MFRCNN, which is established by the multi-scale and feature reuse structure rather than simply stacking convolution layer by layer. To examine the model performances, two types of tomato leaf disease datasets were collected. One is the laboratory-based dataset, including one healthy and nine diseases, and the other is the field-based dataset, including five kinds of diseases. Afterward, the proposed MFRCNN and some popular CNN models (AlexNet, SqueezeNet, VGG16, ResNet18, and GoogLeNet) were tested on the two datasets. The results showed that compared to traditional models, the MFRCNN achieved the optimal performance, with an accuracy of 99.01% and 98.75% in laboratory and field datasets, respectively. The MFRCNN not only had the highest accuracy but also had relatively less computing time and few training parameters. Especially in terms of storage space, the MFRCNN model only needs 2.7 MB of space. Therefore, this work provides a novel solution for plant disease diagnosis, which is of great importance for the development of plant disease diagnosis systems on low-performance terminals.展开更多
目的通过机器学习分析“舌边白涎”舌象特性,对舌象进行局部特征识别研究,探讨卷积神经网络算法在舌象识别应用中的性能。方法使用Python进行图像预处理,搭建用于舌象识别的视觉几何组16层(visual geometry group 16,VGG16)卷积神经网...目的通过机器学习分析“舌边白涎”舌象特性,对舌象进行局部特征识别研究,探讨卷积神经网络算法在舌象识别应用中的性能。方法使用Python进行图像预处理,搭建用于舌象识别的视觉几何组16层(visual geometry group 16,VGG16)卷积神经网络模型,分析其对“舌边白涎”舌象鉴别分析的效果,并结合热力图分析“舌边白涎”典型舌象表现。结果基于PyTorch框架,进行卷积神经网络的舌象鉴别研究,VGG16及残差网络50层(residual network 50,ResNet50)模型验证准确率均较高,达到80%以上,且ResNet50模型优于VGG16模型,可为舌象识别提供一定参考。基于加权梯度类激活映射(gradient-weighted class activation mapping,Grad-CAM)技术,通过舌苔舌色差异分布的网络可视化,有助于直观进行模型评估分析。结论基于卷积神经网络模型对舌象数据库进行分析,实现“舌边白涎”舌象识别,有助于临床诊疗的客观化辅助分析,为舌诊智能化发展提供一定借鉴。展开更多
为了解决寻常型银屑病在样本分布不平衡的数据中可能会导致的深度学习模型诊断效果下降等问题,通过结合改进模糊KMeans聚类算法对高聚类复杂度数据的处理能力以及Visual Geometry Group 13(VGG13)深度卷积神经网络模型的预测能力,提出...为了解决寻常型银屑病在样本分布不平衡的数据中可能会导致的深度学习模型诊断效果下降等问题,通过结合改进模糊KMeans聚类算法对高聚类复杂度数据的处理能力以及Visual Geometry Group 13(VGG13)深度卷积神经网络模型的预测能力,提出一种基于改进模糊KMeans聚类算法的VGG13深度卷积神经网络(VGG13-KMeans)模型,并将其应用于寻常型银屑病的诊断任务中。实验结果表明,相较于VGG13以及ResNet18两种方法,本文方法更适用于对银屑病特征的识别。展开更多
文摘Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
基金supported by State Grid Corporation Limited Science and Technology Project Funding(Contract No.SGCQSQ00YJJS2200380).
文摘There is instability in the distributed energy storage cloud group end region on the power grid side.In order to avoid large-scale fluctuating charging and discharging in the power grid environment and make the capacitor components showa continuous and stable charging and discharging state,a hierarchical time-sharing configuration algorithm of distributed energy storage cloud group end region on the power grid side based on multi-scale and multi feature convolution neural network is proposed.Firstly,a voltage stability analysis model based onmulti-scale and multi feature convolution neural network is constructed,and the multi-scale and multi feature convolution neural network is optimized based on Self-OrganizingMaps(SOM)algorithm to analyze the voltage stability of the cloud group end region of distributed energy storage on the grid side under the framework of credibility.According to the optimal scheduling objectives and network size,the distributed robust optimal configuration control model is solved under the framework of coordinated optimal scheduling at multiple time scales;Finally,the time series characteristics of regional power grid load and distributed generation are analyzed.According to the regional hierarchical time-sharing configuration model of“cloud”,“group”and“end”layer,the grid side distributed energy storage cloud group end regional hierarchical time-sharing configuration algorithm is realized.The experimental results show that after applying this algorithm,the best grid side distributed energy storage configuration scheme can be determined,and the stability of grid side distributed energy storage cloud group end region layered timesharing configuration can be improved.
基金Supported by the National Natural Science Foundation of China(61903336,61976190)the Natural Science Foundation of Zhejiang Province(LY21F030015)。
文摘Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale region of interest(ROI).However,some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space.Also,existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information,which is crucial in pulse extraction problems,resulting in insufficient spatiotemporal feature modelling.Methods Here,we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution(SSTC)and dimension separable attention(DSAT).First,to solve the problem of a single-scale ROI,we constructed a multi-scale feature space for initial signal separation.Second,SSTC and DSAT were designed for efficient spatiotemporal correlation modeling,which increased the information interaction between the long-span time and space dimensions;this placed more emphasis on temporal features.Results The signal-to-noise ratio(SNR)of the proposed network reached 9.58dB on the PURE dataset and 6.77dB on the UBFC-rPPG dataset,outperforming state-of-the-art algorithms.Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals.The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.
基金Supported in part by Natural Science Foundation of China(Grant Nos.51835009,51705398)Shaanxi Province 2020 Natural Science Basic Research Plan(Grant No.2020JQ-042)Aeronautical Science Foundation(Grant No.2019ZB070001).
文摘As an integrated application of modern information technologies and artificial intelligence,Prognostic and Health Management(PHM)is important for machine health monitoring.Prediction of tool wear is one of the symbolic applications of PHM technology in modern manufacturing systems and industry.In this paper,a multi-scale Convolutional Gated Recurrent Unit network(MCGRU)is proposed to address raw sensory data for tool wear prediction.At the bottom of MCGRU,six parallel and independent branches with different kernel sizes are designed to form a multi-scale convolutional neural network,which augments the adaptability to features of different time scales.These features of different scales extracted from raw data are then fed into a Deep Gated Recurrent Unit network to capture long-term dependencies and learn significant representations.At the top of the MCGRU,a fully connected layer and a regression layer are built for cutting tool wear prediction.Two case studies are performed to verify the capability and effectiveness of the proposed MCGRU network and results show that MCGRU outperforms several state-of-the-art baseline models.
基金Supported by the National Natural Science Foundation of China(No.61602191,61672521,61375037,61473291,61572501,61572536,61502491,61372107,61401167)the Natural Science Foundation of Fujian Province(No.2016J01308)+3 种基金the Scientific and Technology Funds of Quanzhou(No.2015Z114)the Scientific and Technology Funds of Xiamen(No.3502Z20173045)the Promotion Program for Young and Middle aged Teacher in Science and Technology Research of Huaqiao University(No.ZQN-PY418,ZQN-YX403)the Scientific Research Funds of Huaqiao University(No.16BS108)
文摘Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances,varied poses and different camera views. A multiscale and multi-label convolutional neural network( MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The pedestrian attribute classification problem is firstly transformed into a multi-label problem including multiple binary attributes needed to be classified. Then,the multi-label problem is solved by fully connecting all binary attributes to multi-scale features with logistic regression functions. Moreover,the multi-scale features are obtained by concatenating those featured maps produced from multiple pooling layers of the MSMLCNN at different scales. Extensive experiment results show that the proposed MSMLCNN outperforms state-of-the-art pedestrian attribute classification methods with a large margin.
文摘Due to the lack of long-range association and spatial location information,fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods.This paper presents a convolutional structure with multi-scale fusion to optimize the step of clothing feature extraction and a self-attention module to capture long-range association information.The structure enables the self-attention mechanism to directly participate in the process of information exchange through the down-scaling projection operation of the multi-scale framework.In addition,the improved self-attention module introduces the extraction of 2-dimensional relative position information to make up for its lack of ability to extract spatial position features from clothing images.The experimental results based on the colorful fashion parsing dataset(CFPD)show that the proposed network structure achieves 53.68%mean intersection over union(mIoU)and has better performance on the clothing parsing task.
基金the National Natural Science Foundation of China(61772149,61866009,61762028,U1701267,61702169)Guangxi Science and Technology Project(2019GXNSFFA245014,ZY20198016,AD18281079,AD18216004)+1 种基金the Natural Science Foundation of Hunan Province(2020JJ3014)Guangxi Colleges and Universities Key Laboratory of Intelligent Processing of Computer Images and Graphics(GIIP202001).
文摘The tradeoff between efficiency and model size of the convolutional neural network(CNN)is an essential issue for applications of CNN-based algorithms to diverse real-world tasks.Although deep learning-based methods have achieved significant improvements in image super-resolution(SR),current CNNbased techniques mainly contain massive parameters and a high computational complexity,limiting their practical applications.In this paper,we present a fast and lightweight framework,named weighted multi-scale residual network(WMRN),for a better tradeoff between SR performance and computational efficiency.With the modified residual structure,depthwise separable convolutions(DS Convs)are employed to improve convolutional operations’efficiency.Furthermore,several weighted multi-scale residual blocks(WMRBs)are stacked to enhance the multi-scale representation capability.In the reconstruction subnetwork,a group of Conv layers are introduced to filter feature maps to reconstruct the final high-quality image.Extensive experiments were conducted to evaluate the proposed model,and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.
基金National Key Research and Development Project,China(No.2018YFB1308800)。
文摘The background pattern of patterned fabrics is complex,which has a great interference in the extraction of defect features.Traditional machine vision algorithms rely on artificially designed features,which are greatly affected by background patterns and are difficult to effectively extract flaw features.Therefore,a convolutional neural network(CNN)with automatic feature extraction is proposed.On the basis of the two-stage detection model Faster R-CNN,Resnet-50 is used as the backbone network,and the problem of flaws with extreme aspect ratio is solved by improving the initialization algorithm of the prior frame aspect ratio,and the improved multi-scale model is designed to improve detection of small defects.The cascade R-CNN is introduced to improve the accuracy of defect detection,and the online hard example mining(OHEM)algorithm is used to strengthen the learning of hard samples to reduce the interference of complex backgrounds on the defect detection of patterned fabrics,and construct the focal loss as a loss function to reduce the impact of sample imbalance.In order to verify the effectiveness of the improved algorithm,a defect detection comparison experiment was set up.The experimental results show that the accuracy of the defect detection algorithm of patterned fabrics in this paper can reach 95.7%,and it can accurately locate the defect location and meet the actual needs of the factory.
基金This work was supported by the Project of Sichuan Outstanding Young Scientific and Technological Talents(19JCQN0003)the major Project of Education Department in Sichuan(17ZA0063 and 2017JQ0030)+1 种基金in part by the Natural Science Foundation for Young Scientists of CUIT(J201704)the Sichuan Science and Technology Program(2019JDRC0077).
文摘Cardiomyopathy is one of the most serious public health threats.The precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning.Cardiologists are often required to draw endocardial and epicardial contours of the left ventricle(LV)manually in routine clinical diagnosis or treatment planning period.This task is time-consuming and error-prone.Therefore,it is necessary to develop a fully automated end-to-end semantic segmentation method on cardiac magnetic resonance(CMR)imaging datasets.However,due to the low image quality and the deformation caused by heartbeat,there is no effective tool for fully automated end-to-end cardiac segmentation task.In this work,we propose a multi-scale segmentation network(MSSN)for left ventricle segmentation.It can effectively learn myocardium and blood pool structure representations from 2D short-axis CMR image slices in a multi-scale way.Specifically,our method employs both parallel and serial of dilated convolution layers with different dilation rates to capture multi-scale semantic features.Moreover,we design graduated up-sampling layers with subpixel layers as the decoder to reconstruct lost spatial information and produce accurate segmentation masks.We validated our method using 164 T1 Mapping CMR images and showed that it outperforms the advanced convolutional neural network(CNN)models.In validation metrics,we archived the Dice Similarity Coefficient(DSC)metric of 78.96%.
基金supported by the China Agriculture Research System (CARS-170404)Qingyuan Science and Technology Plan (Grant No.2022KJJH063)Guangzhou Science and Technology Plan (Grant No.201903010063).
文摘Various diseases seriously affect the quality and yield of tomatoes. Fast and accurate identification of disease types is of great significance for the development of smart agriculture. Many Convolution Neural Network (CNN) models have been applied to the identification of tomato leaf diseases and achieved good results. However, some of these are executed at the cost of large calculation time and huge storage space. This study proposed a lightweight CNN model named MFRCNN, which is established by the multi-scale and feature reuse structure rather than simply stacking convolution layer by layer. To examine the model performances, two types of tomato leaf disease datasets were collected. One is the laboratory-based dataset, including one healthy and nine diseases, and the other is the field-based dataset, including five kinds of diseases. Afterward, the proposed MFRCNN and some popular CNN models (AlexNet, SqueezeNet, VGG16, ResNet18, and GoogLeNet) were tested on the two datasets. The results showed that compared to traditional models, the MFRCNN achieved the optimal performance, with an accuracy of 99.01% and 98.75% in laboratory and field datasets, respectively. The MFRCNN not only had the highest accuracy but also had relatively less computing time and few training parameters. Especially in terms of storage space, the MFRCNN model only needs 2.7 MB of space. Therefore, this work provides a novel solution for plant disease diagnosis, which is of great importance for the development of plant disease diagnosis systems on low-performance terminals.
文摘目的通过机器学习分析“舌边白涎”舌象特性,对舌象进行局部特征识别研究,探讨卷积神经网络算法在舌象识别应用中的性能。方法使用Python进行图像预处理,搭建用于舌象识别的视觉几何组16层(visual geometry group 16,VGG16)卷积神经网络模型,分析其对“舌边白涎”舌象鉴别分析的效果,并结合热力图分析“舌边白涎”典型舌象表现。结果基于PyTorch框架,进行卷积神经网络的舌象鉴别研究,VGG16及残差网络50层(residual network 50,ResNet50)模型验证准确率均较高,达到80%以上,且ResNet50模型优于VGG16模型,可为舌象识别提供一定参考。基于加权梯度类激活映射(gradient-weighted class activation mapping,Grad-CAM)技术,通过舌苔舌色差异分布的网络可视化,有助于直观进行模型评估分析。结论基于卷积神经网络模型对舌象数据库进行分析,实现“舌边白涎”舌象识别,有助于临床诊疗的客观化辅助分析,为舌诊智能化发展提供一定借鉴。
文摘为了解决寻常型银屑病在样本分布不平衡的数据中可能会导致的深度学习模型诊断效果下降等问题,通过结合改进模糊KMeans聚类算法对高聚类复杂度数据的处理能力以及Visual Geometry Group 13(VGG13)深度卷积神经网络模型的预测能力,提出一种基于改进模糊KMeans聚类算法的VGG13深度卷积神经网络(VGG13-KMeans)模型,并将其应用于寻常型银屑病的诊断任务中。实验结果表明,相较于VGG13以及ResNet18两种方法,本文方法更适用于对银屑病特征的识别。