Accurate diagnosis of apple leaf diseases is crucial for improving the quality of apple production and promoting the development of the apple industry. However, apple leaf diseases do not differ significantly from ima...Accurate diagnosis of apple leaf diseases is crucial for improving the quality of apple production and promoting the development of the apple industry. However, apple leaf diseases do not differ significantly from image texture and structural information. The difficulties in disease feature extraction in complex backgrounds slow the related research progress. To address the problems, this paper proposes an improved multi-scale inverse bottleneck residual network model based on a triplet parallel attention mechanism, which is built upon ResNet-50, while improving and combining the inception module and ResNext inverse bottleneck blocks, to recognize seven types of apple leaf(including six diseases of alternaria leaf spot, brown spot, grey spot, mosaic, rust, scab, and one healthy). First, the 3×3 convolutions in some of the residual modules are replaced by multi-scale residual convolutions, the convolution kernels of different sizes contained in each branch of the multi-scale convolution are applied to extract feature maps of different sizes, and the outputs of these branches are multi-scale fused by summing to enrich the output features of the images. Second, the global layer-wise dynamic coordinated inverse bottleneck structure is used to reduce the network feature loss. The inverse bottleneck structure makes the image information less lossy when transforming from different dimensional feature spaces. The fusion of multi-scale and layer-wise dynamic coordinated inverse bottlenecks makes the model effectively balances computational efficiency and feature representation capability, and more robust with a combination of horizontal and vertical features in the fine identification of apple leaf diseases. Finally, after each improved module, a triplet parallel attention module is integrated with cross-dimensional interactions among channels through rotations and residual transformations, which improves the parallel search efficiency of important features and the recognition rate of the network with relatively small computational costs while the dimensional dependencies are improved. To verify the validity of the model in this paper, we uniformly enhance apple leaf disease images screened from the public data sets of Plant Village, Baidu Flying Paddle, and the Internet. The final processed image count is 14,000. The ablation study, pre-processing comparison, and method comparison are conducted on the processed datasets. The experimental results demonstrate that the proposed method reaches 98.73% accuracy on the adopted datasets, which is 1.82% higher than the classical ResNet-50 model, and 0.29% better than the apple leaf disease datasets before preprocessing. It also achieves competitive results in apple leaf disease identification compared to some state-ofthe-art methods.展开更多
Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea.Traditional tea-picking machines may compromise the quality of the tea leaves.High-quality teas are often...Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea.Traditional tea-picking machines may compromise the quality of the tea leaves.High-quality teas are often handpicked and need more delicate operations in intelligent picking machines.Compared with traditional image processing techniques,deep learning models have stronger feature extraction capabilities,and better generalization and are more suitable for practical tea shoot harvesting.However,current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks.We propose a tea shoot instance segmentation model based on multi-scale mixed attention(Mask2FusionNet)using a dataset from the tea garden in Hangzhou.We further analyzed the characteristics of the tea shoot dataset,where the proportion of small to medium-sized targets is 89.9%.Our algorithm is compared with several mainstream object segmentation algorithms,and the results demonstrate that our model achieves an accuracy of 82%in recognizing the tea shoots,showing a better performance compared to other models.Through ablation experiments,we found that ResNet50,PointRend strategy,and the Feature Pyramid Network(FPN)architecture can improve performance by 1.6%,1.4%,and 2.4%,respectively.These experiments demonstrated that our proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets.The results indicate that the proposed Mask2FusionNet model can perform the shoot segmentation in unstructured environments,realizing the individual distinction of tea shoots,and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%.The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.展开更多
Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale regi...Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale region of interest(ROI).However,some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space.Also,existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information,which is crucial in pulse extraction problems,resulting in insufficient spatiotemporal feature modelling.Methods Here,we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution(SSTC)and dimension separable attention(DSAT).First,to solve the problem of a single-scale ROI,we constructed a multi-scale feature space for initial signal separation.Second,SSTC and DSAT were designed for efficient spatiotemporal correlation modeling,which increased the information interaction between the long-span time and space dimensions;this placed more emphasis on temporal features.Results The signal-to-noise ratio(SNR)of the proposed network reached 9.58dB on the PURE dataset and 6.77dB on the UBFC-rPPG dataset,outperforming state-of-the-art algorithms.Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals.The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.展开更多
Recently,deep learning-based image inpainting methods have made great strides in reconstructing damaged regions.However,these methods often struggle to produce satisfactory results when dealing with missing images wit...Recently,deep learning-based image inpainting methods have made great strides in reconstructing damaged regions.However,these methods often struggle to produce satisfactory results when dealing with missing images with large holes,leading to distortions in the structure and blurring of textures.To address these problems,we combine the advantages of transformers and convolutions to propose an image inpainting method that incorporates edge priors and attention mechanisms.The proposed method aims to improve the results of inpainting large holes in images by enhancing the accuracy of structure restoration and the ability to recover texture details.This method divides the inpainting task into two phases:edge prediction and image inpainting.Specifically,in the edge prediction phase,a transformer architecture is designed to combine axial attention with standard self-attention.This design enhances the extraction capability of global structural features and location awareness.It also balances the complexity of self-attention operations,resulting in accurate prediction of the edge structure in the defective region.In the image inpainting phase,a multi-scale fusion attention module is introduced.This module makes full use of multi-level distant features and enhances local pixel continuity,thereby significantly improving the quality of image inpainting.To evaluate the performance of our method.comparative experiments are conducted on several datasets,including CelebA,Places2,and Facade.Quantitative experiments show that our method outperforms the other mainstream methods.Specifically,it improves Peak Signal-to-Noise Ratio(PSNR)and Structure Similarity Index Measure(SSIM)by 1.141~3.234 db and 0.083~0.235,respectively.Moreover,it reduces Learning Perceptual Image Patch Similarity(LPIPS)and Mean Absolute Error(MAE)by 0.0347~0.1753 and 0.0104~0.0402,respectively.Qualitative experiments reveal that our method excels at reconstructing images with complete structural information and clear texture details.Furthermore,our model exhibits impressive performance in terms of the number of parameters,memory cost,and testing time.展开更多
The precise and automatic segmentation of prostate magnetic resonance imaging(MRI)images is vital for assisting doctors in diagnosing prostate diseases.In recent years,many advanced methods have been applied to prosta...The precise and automatic segmentation of prostate magnetic resonance imaging(MRI)images is vital for assisting doctors in diagnosing prostate diseases.In recent years,many advanced methods have been applied to prostate segmentation,but due to the variability caused by prostate diseases,automatic segmentation of the prostate presents significant challenges.In this paper,we propose an attention-guided multi-scale feature fusion network(AGMSF-Net)to segment prostate MRI images.We propose an attention mechanism for extracting multi-scale features,and introduce a 3D transformer module to enhance global feature representation by adding it during the transition phase from encoder to decoder.In the decoder stage,a feature fusion module is proposed to obtain global context information.We evaluate our model on MRI images of the prostate acquired from a local hospital.The relative volume difference(RVD)and dice similarity coefficient(DSC)between the results of automatic prostate segmentation and ground truth were 1.21%and 93.68%,respectively.To quantitatively evaluate prostate volume on MRI,which is of significant clinical significance,we propose a unique AGMSF-Net.The essential performance evaluation and validation experiments have demonstrated the effectiveness of our method in automatic prostate segmentation.展开更多
Whole brain functional connectivity(FC)patterns obtained from resting-state functional magnetic resonance imaging(rs-fMRI)have been widely used in the diagnosis of brain disorders such as autism spectrum disorder(ASD)...Whole brain functional connectivity(FC)patterns obtained from resting-state functional magnetic resonance imaging(rs-fMRI)have been widely used in the diagnosis of brain disorders such as autism spectrum disorder(ASD).Recently,an increasing number of studies have focused on employing deep learning techniques to analyze FC patterns for brain disease classification.However,the high dimensionality of the FC features and the interpretation of deep learning results are issues that need to be addressed in the FC-based brain disease classification.In this paper,we proposed a multi-scale attention-based deep neural network(MSA-DNN)model to classify FC patterns for the ASD diagnosis.The model was implemented by adding a flexible multi-scale attention(MSA)module to the auto-encoder based backbone DNN,which can extract multi-scale features of the FC patterns and change the level of attention for different FCs by continuous learning.Our model will reinforce the weights of important FC features while suppress the unimportant FCs to ensure the sparsity of the model weights and enhance the model interpretability.We performed systematic experiments on the large multi-sites ASD dataset with both ten-fold and leaveone-site-out cross-validations.Results showed that our model outperformed classical methods in brain disease classification and revealed robust intersite prediction performance.We also localized important FC features and brain regions associated with ASD classification.Overall,our study further promotes the biomarker detection and computer-aided classification for ASD diagnosis,and the proposed MSA module is flexible and easy to implement in other classification networks.展开更多
To solve the problem of difficulty in identifying apple diseases in the natural environment and the low application rate of deep learning recognition networks,a lightweight ResNet(LW-ResNet)model for apple disease rec...To solve the problem of difficulty in identifying apple diseases in the natural environment and the low application rate of deep learning recognition networks,a lightweight ResNet(LW-ResNet)model for apple disease recognition is proposed.Based on the deep residual network(ResNet18),the multi-scale feature extraction layer is constructed by group convolution to realize the compression model and improve the extraction ability of different sizes of lesion features.By improving the identity mapping structure to reduce information loss.By introducing the efficient channel attention module(ECANet)to suppress noise from a complex background.The experimental results show that the average precision,recall and F1-score of the LW-ResNet on the test set are 97.80%,97.92%and 97.85%,respectively.The parameter memory is 2.32 MB,which is 94%less than that of ResNet18.Compared with the classic lightweight networks SqueezeNet and MobileNetV2,LW-ResNet has obvious advantages in recognition performance,speed,parameter memory requirement and time complexity.The proposed model has the advantages of low computational cost,low storage cost,strong real-time performance,high identification accuracy,and strong practicability,which can meet the needs of real-time identification task of apple leaf disease on resource-constrained devices.展开更多
基金supported in part by the General Program Hunan Provincial Natural Science Foundation of 2022,China(2022JJ31022)the Undergraduate Education Reform Project of Hunan Province,China(HNJG-20210532)the National Natural Science Foundation of China(62276276)。
文摘Accurate diagnosis of apple leaf diseases is crucial for improving the quality of apple production and promoting the development of the apple industry. However, apple leaf diseases do not differ significantly from image texture and structural information. The difficulties in disease feature extraction in complex backgrounds slow the related research progress. To address the problems, this paper proposes an improved multi-scale inverse bottleneck residual network model based on a triplet parallel attention mechanism, which is built upon ResNet-50, while improving and combining the inception module and ResNext inverse bottleneck blocks, to recognize seven types of apple leaf(including six diseases of alternaria leaf spot, brown spot, grey spot, mosaic, rust, scab, and one healthy). First, the 3×3 convolutions in some of the residual modules are replaced by multi-scale residual convolutions, the convolution kernels of different sizes contained in each branch of the multi-scale convolution are applied to extract feature maps of different sizes, and the outputs of these branches are multi-scale fused by summing to enrich the output features of the images. Second, the global layer-wise dynamic coordinated inverse bottleneck structure is used to reduce the network feature loss. The inverse bottleneck structure makes the image information less lossy when transforming from different dimensional feature spaces. The fusion of multi-scale and layer-wise dynamic coordinated inverse bottlenecks makes the model effectively balances computational efficiency and feature representation capability, and more robust with a combination of horizontal and vertical features in the fine identification of apple leaf diseases. Finally, after each improved module, a triplet parallel attention module is integrated with cross-dimensional interactions among channels through rotations and residual transformations, which improves the parallel search efficiency of important features and the recognition rate of the network with relatively small computational costs while the dimensional dependencies are improved. To verify the validity of the model in this paper, we uniformly enhance apple leaf disease images screened from the public data sets of Plant Village, Baidu Flying Paddle, and the Internet. The final processed image count is 14,000. The ablation study, pre-processing comparison, and method comparison are conducted on the processed datasets. The experimental results demonstrate that the proposed method reaches 98.73% accuracy on the adopted datasets, which is 1.82% higher than the classical ResNet-50 model, and 0.29% better than the apple leaf disease datasets before preprocessing. It also achieves competitive results in apple leaf disease identification compared to some state-ofthe-art methods.
基金This research was supported by the National Natural Science Foundation of China No.62276086the National Key R&D Program of China No.2022YFD2000100Zhejiang Provincial Natural Science Foundation of China under Grant No.LTGN23D010002.
文摘Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea.Traditional tea-picking machines may compromise the quality of the tea leaves.High-quality teas are often handpicked and need more delicate operations in intelligent picking machines.Compared with traditional image processing techniques,deep learning models have stronger feature extraction capabilities,and better generalization and are more suitable for practical tea shoot harvesting.However,current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks.We propose a tea shoot instance segmentation model based on multi-scale mixed attention(Mask2FusionNet)using a dataset from the tea garden in Hangzhou.We further analyzed the characteristics of the tea shoot dataset,where the proportion of small to medium-sized targets is 89.9%.Our algorithm is compared with several mainstream object segmentation algorithms,and the results demonstrate that our model achieves an accuracy of 82%in recognizing the tea shoots,showing a better performance compared to other models.Through ablation experiments,we found that ResNet50,PointRend strategy,and the Feature Pyramid Network(FPN)architecture can improve performance by 1.6%,1.4%,and 2.4%,respectively.These experiments demonstrated that our proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets.The results indicate that the proposed Mask2FusionNet model can perform the shoot segmentation in unstructured environments,realizing the individual distinction of tea shoots,and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%.The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
基金Supported by the National Natural Science Foundation of China(61903336,61976190)the Natural Science Foundation of Zhejiang Province(LY21F030015)。
文摘Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale region of interest(ROI).However,some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space.Also,existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information,which is crucial in pulse extraction problems,resulting in insufficient spatiotemporal feature modelling.Methods Here,we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution(SSTC)and dimension separable attention(DSAT).First,to solve the problem of a single-scale ROI,we constructed a multi-scale feature space for initial signal separation.Second,SSTC and DSAT were designed for efficient spatiotemporal correlation modeling,which increased the information interaction between the long-span time and space dimensions;this placed more emphasis on temporal features.Results The signal-to-noise ratio(SNR)of the proposed network reached 9.58dB on the PURE dataset and 6.77dB on the UBFC-rPPG dataset,outperforming state-of-the-art algorithms.Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals.The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.
基金supported in part by the National Natural Science Foundation of China under Grant 62062061/in part by the Major Project Cultivation Fund of Xizang Minzu University under Grant 324112300447.
文摘Recently,deep learning-based image inpainting methods have made great strides in reconstructing damaged regions.However,these methods often struggle to produce satisfactory results when dealing with missing images with large holes,leading to distortions in the structure and blurring of textures.To address these problems,we combine the advantages of transformers and convolutions to propose an image inpainting method that incorporates edge priors and attention mechanisms.The proposed method aims to improve the results of inpainting large holes in images by enhancing the accuracy of structure restoration and the ability to recover texture details.This method divides the inpainting task into two phases:edge prediction and image inpainting.Specifically,in the edge prediction phase,a transformer architecture is designed to combine axial attention with standard self-attention.This design enhances the extraction capability of global structural features and location awareness.It also balances the complexity of self-attention operations,resulting in accurate prediction of the edge structure in the defective region.In the image inpainting phase,a multi-scale fusion attention module is introduced.This module makes full use of multi-level distant features and enhances local pixel continuity,thereby significantly improving the quality of image inpainting.To evaluate the performance of our method.comparative experiments are conducted on several datasets,including CelebA,Places2,and Facade.Quantitative experiments show that our method outperforms the other mainstream methods.Specifically,it improves Peak Signal-to-Noise Ratio(PSNR)and Structure Similarity Index Measure(SSIM)by 1.141~3.234 db and 0.083~0.235,respectively.Moreover,it reduces Learning Perceptual Image Patch Similarity(LPIPS)and Mean Absolute Error(MAE)by 0.0347~0.1753 and 0.0104~0.0402,respectively.Qualitative experiments reveal that our method excels at reconstructing images with complete structural information and clear texture details.Furthermore,our model exhibits impressive performance in terms of the number of parameters,memory cost,and testing time.
基金This work was supported in part by the National Natural Science Foundation of China(Grant#:82260362)in part by the National Key R&D Program of China(Grant#:2021ZD0111000)+1 种基金in part by the Key R&D Project of Hainan Province(Grant#:ZDYF2021SHFZ243)in part by the Major Science and Technology Project of Haikou(Grant#:2020-009).
文摘The precise and automatic segmentation of prostate magnetic resonance imaging(MRI)images is vital for assisting doctors in diagnosing prostate diseases.In recent years,many advanced methods have been applied to prostate segmentation,but due to the variability caused by prostate diseases,automatic segmentation of the prostate presents significant challenges.In this paper,we propose an attention-guided multi-scale feature fusion network(AGMSF-Net)to segment prostate MRI images.We propose an attention mechanism for extracting multi-scale features,and introduce a 3D transformer module to enhance global feature representation by adding it during the transition phase from encoder to decoder.In the decoder stage,a feature fusion module is proposed to obtain global context information.We evaluate our model on MRI images of the prostate acquired from a local hospital.The relative volume difference(RVD)and dice similarity coefficient(DSC)between the results of automatic prostate segmentation and ground truth were 1.21%and 93.68%,respectively.To quantitatively evaluate prostate volume on MRI,which is of significant clinical significance,we propose a unique AGMSF-Net.The essential performance evaluation and validation experiments have demonstrated the effectiveness of our method in automatic prostate segmentation.
基金This work was supported by the National Natural Science Foundation of China(No.61906006).
文摘Whole brain functional connectivity(FC)patterns obtained from resting-state functional magnetic resonance imaging(rs-fMRI)have been widely used in the diagnosis of brain disorders such as autism spectrum disorder(ASD).Recently,an increasing number of studies have focused on employing deep learning techniques to analyze FC patterns for brain disease classification.However,the high dimensionality of the FC features and the interpretation of deep learning results are issues that need to be addressed in the FC-based brain disease classification.In this paper,we proposed a multi-scale attention-based deep neural network(MSA-DNN)model to classify FC patterns for the ASD diagnosis.The model was implemented by adding a flexible multi-scale attention(MSA)module to the auto-encoder based backbone DNN,which can extract multi-scale features of the FC patterns and change the level of attention for different FCs by continuous learning.Our model will reinforce the weights of important FC features while suppress the unimportant FCs to ensure the sparsity of the model weights and enhance the model interpretability.We performed systematic experiments on the large multi-sites ASD dataset with both ten-fold and leaveone-site-out cross-validations.Results showed that our model outperformed classical methods in brain disease classification and revealed robust intersite prediction performance.We also localized important FC features and brain regions associated with ASD classification.Overall,our study further promotes the biomarker detection and computer-aided classification for ASD diagnosis,and the proposed MSA module is flexible and easy to implement in other classification networks.
基金funded by the Science and Technology Development Program of Jilin Province(20190301024NY)the Precision Agriculture and Big Data Engineering Research Center of Jilin Province(2020C005).
文摘To solve the problem of difficulty in identifying apple diseases in the natural environment and the low application rate of deep learning recognition networks,a lightweight ResNet(LW-ResNet)model for apple disease recognition is proposed.Based on the deep residual network(ResNet18),the multi-scale feature extraction layer is constructed by group convolution to realize the compression model and improve the extraction ability of different sizes of lesion features.By improving the identity mapping structure to reduce information loss.By introducing the efficient channel attention module(ECANet)to suppress noise from a complex background.The experimental results show that the average precision,recall and F1-score of the LW-ResNet on the test set are 97.80%,97.92%and 97.85%,respectively.The parameter memory is 2.32 MB,which is 94%less than that of ResNet18.Compared with the classic lightweight networks SqueezeNet and MobileNetV2,LW-ResNet has obvious advantages in recognition performance,speed,parameter memory requirement and time complexity.The proposed model has the advantages of low computational cost,low storage cost,strong real-time performance,high identification accuracy,and strong practicability,which can meet the needs of real-time identification task of apple leaf disease on resource-constrained devices.