Geological discontinuity (GD) plays a pivotal role in determining the catastrophic mechanical failure of jointed rock masses. Accurate and efficient acquisition of GD networks is essential for characterizing and understanding the progressive damage mechanisms of slopes based on monitoring image data. Inspired by recent advances in computer vision, deep learning (DL) models have been widely utilized for image-based fracture identification. The multi-scale characteristics, image resolution and annotation quality of images cause a scale-space effect (SSE) that makes features indistinguishable from noise, directly affecting the accuracy. However, this effect has not received adequate attention. Herein, we address this gap by collecting slope images at various proportional scales and constructing multi-scale datasets using image processing techniques. Next, we quantify the intensity of feature signals using metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Combining these metrics with scale-space theory, we investigate the influence of the SSE on the differentiation of multi-scale features and the accuracy of recognition. It is found that augmenting the image's detail capacity does not always yield benefits for vision-based recognition models. In light of these observations, we propose a scale hybridization approach based on the diffusion mechanism of scale-space representation. The results show that scale hybridization strengthens the tolerance of multi-scale feature recognition under complex environmental noise interference and significantly enhances the recognition accuracy of GD. It also facilitates the objective understanding, description and analysis of the rock behavior and stability of slopes from the perspective of image data.
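The PSNR metric named in the abstract above can be illustrated with a minimal sketch. It assumes 8-bit grayscale images stored as lists of rows; the function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the paper.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images."""
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same shape")
    count = sum(len(r) for r in reference)
    # Mean squared error over all pixel pairs.
    mse = sum(
        (a - b) ** 2 for r, t in zip(reference, test) for a, b in zip(r, t)
    ) / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

clean = [[100, 100], [100, 100]]
noisy = [[110, 90], [110, 90]]
value = psnr(clean, noisy)
```

A higher PSNR means the test image is closer to the reference; identical images give infinity.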
In recent times, an image enhancement approach, which learns the global transformation function using deep neural networks, has gained attention. However, many existing methods based on this approach have a limitation: their transformation functions are too simple to imitate complex colour transformations between low-quality images and manually retouched high-quality images. In order to address this limitation, a simple yet effective approach for image enhancement is proposed. The proposed algorithm is based on the channel-wise intensity transformation. However, this transformation is applied to the learnt embedding space instead of specific colour spaces, and the enhanced features are then returned to colours. To this end, the authors define the continuous intensity transformation (CIT) to describe the mapping between input and output intensities on the embedding space. Then, the enhancement network is developed, which produces multi-scale feature maps from input images, derives the set of transformation functions, and performs the CIT to obtain enhanced images. Extensive experiments on the MIT-Adobe 5K dataset demonstrate that the authors’ approach improves the performance of conventional intensity transforms on colour space metrics. Specifically, the authors achieved a 3.8% improvement in peak signal-to-noise ratio, a 1.8% improvement in structural similarity index measure, and a 27.5% improvement in learned perceptual image patch similarity. Also, the authors’ algorithm outperforms state-of-the-art alternatives on three image enhancement datasets: MIT-Adobe 5K, Low-Light, and Google HDR+.
As a branch of quantum image processing, quantum image scaling has been widely studied. However, most of the existing quantum image scaling algorithms are based on nearest-neighbor interpolation and bilinear interpolation; the quantum version of bicubic interpolation has not yet been studied. In this work, we present the first quantum image scaling scheme for bicubic interpolation based on the novel enhanced quantum representation (NEQR). Our scheme can realize synchronous enlargement and reduction of an image of size 2^n × 2^n by an integer multiple. Firstly, the image is represented by NEQR and the original image coordinates are obtained through multiple CNOT modules. Then, 16 neighborhood pixels are obtained by quantum operation circuits, and the corresponding weights of these pixels are calculated by quantum arithmetic modules. Finally, a quantum matrix operation, instead of a classical convolution operation, is used to realize the sum of convolution of these pixels. Through simulation experiments and complexity analysis, we demonstrate that our scheme achieves exponential speedup over the classical bicubic interpolation algorithm, and performs better than the quantum version of bilinear interpolation.
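For orientation, the classical bicubic step that the scheme above is compared against (16 neighborhood pixels combined with per-pixel weights) can be sketched as follows. This is a classical illustration only, not the quantum circuit; it assumes the commonly used Keys cubic convolution kernel with a = -0.5, which the abstract does not specify, and all function names are hypothetical.

```python
def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (a = -0.5 is an assumed, common choice)."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0

def axis_weights(t, a=-0.5):
    """Weights for the four samples at offsets -1, 0, 1, 2, given fraction t in [0, 1)."""
    return [
        cubic_kernel(t + 1.0, a),
        cubic_kernel(t, a),
        cubic_kernel(1.0 - t, a),
        cubic_kernel(2.0 - t, a),
    ]

def bicubic_sample(patch, tx, ty, a=-0.5):
    """Interpolate inside a 4x4 patch; patch[1][1] is the pixel at the floor position."""
    wx = axis_weights(tx, a)
    wy = axis_weights(ty, a)
    # The 16 pixel weights are the outer product of the row and column weights.
    return sum(wy[i] * wx[j] * patch[i][j] for i in range(4) for j in range(4))

ramp = [[float(j) for j in range(4)] for _ in range(4)]  # intensity equals column index
sample = bicubic_sample(ramp, 0.25, 0.5)
```

The weights along each axis sum to one, so a constant patch is reproduced exactly and a linear ramp is interpolated linearly.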
Oscillation detection has been a hot research topic in industry due to the high incidence of oscillating loops and their negative impact on plant profitability. Although numerous automatic detection techniques have been proposed, most of them can only address part of the practical difficulties. An oscillation is heuristically defined as a visually apparent periodic variation. However, manual visual inspection is labor-intensive and prone to missed detection. Convolutional neural networks (CNNs), inspired by animal visual systems, have emerged with powerful feature extraction capabilities. In this work, an exploration of the typical CNN models for visual oscillation detection is performed. Specifically, we tested MobileNet-V1, ShuffleNet-V2, EfficientNet-B0, and GhostNet models, and found that such a visual framework is well-suited for oscillation detection. The feasibility and validity of this framework are verified using extensive numerical and industrial cases. Compared with state-of-the-art oscillation detectors, the suggested framework is more straightforward and more robust to noise and mean-nonstationarity. In addition, this framework generalizes well and is capable of handling features that are not present in the training data, such as multiple oscillations and outliers.
Diagnosing various diseases such as glaucoma, age-related macular degeneration, cardiovascular conditions, and diabetic retinopathy involves segmenting retinal blood vessels. The task is particularly challenging when dealing with color fundus images due to issues like non-uniform illumination, low contrast, and variations in vessel appearance, especially in the presence of different pathologies. Furthermore, the speed of the retinal vessel segmentation system is of utmost importance. With the surge of available big data, the speed of the algorithm becomes increasingly important, carrying almost as much weight as its accuracy. To address these challenges, we present a novel approach for retinal vessel segmentation, leveraging efficient and robust techniques based on multiscale line detection and mathematical morphology. Our algorithm’s performance is evaluated on two publicly available datasets, namely the Digital Retinal Images for Vessel Extraction (DRIVE) dataset and the Structure Analysis of Retina (STARE) dataset. The experimental results demonstrate the effectiveness of our method, with mean accuracy values of 0.9467 for DRIVE and 0.9535 for STARE, as well as sensitivity values of 0.6952 for DRIVE and 0.6809 for STARE. Notably, our algorithm exhibits competitive performance with state-of-the-art methods. Importantly, it operates at an average speed of 3.73 s per image for DRIVE and 3.75 s for STARE. It is worth noting that these results were achieved using Matlab scripts containing multiple loops. This suggests that the processing time can be further reduced by replacing loops with vectorization. Thus the proposed algorithm can be deployed in real-time applications. In summary, our proposed system strikes a fine balance between swift computation and accuracy that is on par with the best available methods in the field.
Person image generation aims to generate images that maintain the original human appearance in different target poses. Recent works have revealed that the critical element in achieving this task is the alignment of the appearance domain and the pose domain. Previous alignment methods, such as appearance flow warping, correspondence learning and cross attention, often encounter challenges when it comes to producing fine texture details. These approaches suffer from limitations in accurately estimating appearance flows due to the lack of a global receptive field. Alternatively, they can only perform cross-domain alignment on high-level feature maps with small spatial dimensions, since the computational complexity increases quadratically with larger feature sizes. In this article, the significance of multi-scale alignment, in both low-level and high-level domains, for ensuring reliable cross-domain alignment of appearance and pose is demonstrated. To this end, a novel and effective method named Multi-scale Cross-domain Alignment (MCA) is proposed. Firstly, MCA adopts a global context aggregation transformer to model multi-scale interaction between pose and appearance inputs, which employs pair-wise window-based cross attention. Furthermore, leveraging the integrated global source information for each target position, MCA applies a flexible flow prediction head and point correlation to effectively conduct warping and fusing for the final transformed person image generation. Our proposed MCA outperforms other methods on two popular datasets, which verifies the effectiveness of our approach.
Strong atmospheric turbulence reduces astronomical seeing, causing speckle images acquired by ground-based solar telescopes to become blurred and distorted. Severe distortion in speckle images impedes image phase deviation in the speckle masking reconstruction method, leading to the appearance of spurious imaging artifacts. Relying only on linear image degradation principles to reconstruct solar images is insufficient. To solve this problem, we propose the multiframe blind deconvolution combined with non-rigid alignment (MFBD-CNRA) method for solar image reconstruction. We consider image distortion caused by atmospheric turbulence and use non-rigid alignment to correct pixel-level distortion, thereby achieving nonlinear constraints to complement image intensity changes. After creating the corrected speckle image, we use the linear method to solve the wavefront phase, obtaining the target image. We verify the effectiveness of our method, in comparison with others, using solar observation data from the 1 m New Vacuum Solar Telescope (NVST). This new method successfully reconstructs high-resolution images of solar observations with a Fried parameter r0 of approximately 10 cm, and enhances images at high frequency. When r0 is approximately 5 cm, the new method is even more effective. It reconstructs the edges of solar graining and sunspots, and the result is greatly enhanced at mid and high frequencies compared with other methods. Comparisons confirm the effectiveness of this method with respect to both nonlinear and linear constraints in solar image reconstruction. This provides a suitable solution for image reconstruction in ground-based solar observations under strong atmospheric turbulence.
The accumulation of snow and ice on PV modules can have a detrimental impact on power generation, leading to reduced efficiency for prolonged periods. Thus, it becomes imperative to develop an intelligent system capable of accurately assessing the extent of snow and ice coverage on PV modules. To address this issue, the article proposes an innovative ice and snow recognition algorithm that effectively segments the ice and snow areas within the collected images. Furthermore, the algorithm incorporates an analysis of the morphological characteristics of ice and snow coverage on PV modules, allowing for the establishment of a residual ice and snow recognition process. This process utilizes both the external ellipse method and the pixel statistical method to refine the identification process. The effectiveness of the proposed algorithm is validated through extensive testing with isolated and continuous snow area pictures. The results demonstrate the algorithm’s accuracy and reliability in identifying and quantifying residual snow and ice on PV modules. In conclusion, this research presents a valuable method for accurately detecting and quantifying snow and ice coverage on PV modules. This breakthrough is of utmost significance for PV power plants, as it enables predictions of power generation efficiency and facilitates efficient PV maintenance during the challenging winter conditions characterized by snow and ice. By proactively managing snow and ice coverage, PV power plants can optimize energy production and minimize downtime, ensuring a sustainable and reliable renewable energy supply.
Real-time capabilities and computational efficiency are provided by parallel image processing utilizing OpenMP. However, race conditions can affect the accuracy and reliability of the outcomes. This paper highlights the importance of addressing race conditions in parallel image processing, specifically focusing on color inverse filtering using OpenMP. We considered three solutions to race conditions, each with distinct characteristics: (1) #pragma omp atomic, which protects individual memory operations for fine-grained control; (2) #pragma omp critical, which protects entire code blocks for exclusive access; and (3) #pragma omp parallel sections reduction, which employs a reduction clause for safe aggregation of values across threads. Our findings show that the produced images were unaffected by race conditions. However, it becomes evident that solving the race conditions in the code makes it significantly faster, especially when it is executed on multiple cores.
In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of color images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of color images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP’s effectiveness in accelerating image manipulation tasks.
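The idea of distributing grayscale conversion across workers can be sketched in Python as an analogue of the OpenMP approach described above. This is not the authors' code: it uses a thread pool rather than OpenMP, and the ITU-R BT.601 luminance weights are an assumption, since the abstract does not state its conversion formula.

```python
from concurrent.futures import ThreadPoolExecutor

# ITU-R BT.601 luma weights (assumed; the abstract does not give its formula).
WEIGHTS = (0.299, 0.587, 0.114)

def gray_row(row):
    """Convert one row of (R, G, B) tuples to rounded luminance values."""
    wr, wg, wb = WEIGHTS
    return [round(wr * r + wg * g + wb * b) for r, g, b in row]

def to_grayscale(image, workers=4):
    """Convert an image (list of rows of RGB tuples), one row per task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves row order; each task produces only its own output row,
        # so no shared state is mutated and no synchronization is needed.
        return list(pool.map(gray_row, image))

image = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(image)
```

Because every row is independent, the work partitions cleanly. In CPython, threads do not speed up pure-Python arithmetic because of the global interpreter lock, so this sketch shows only the decomposition, not the speedups reported in the abstract.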
Health care is an important part of human life and is a right for everyone. One of the most basic human rights is to receive health care whenever it is needed. However, this is simply not an option for everyone, owing to the social conditions in which some communities live. This paper aims to serve as a reference point and guide for users who are interested in monitoring their health, particularly their blood analysis, so that they can be aware of their health condition in an easy way. This study introduces an algorithmic approach for extracting and analyzing Complete Blood Count (CBC) parameters from scanned images. The algorithm employs Optical Character Recognition (OCR) technology to process images containing tabular data, specifically targeting CBC parameter tables. Upon image processing, the algorithm extracts data and identifies CBC parameters and their corresponding values. It evaluates the status (High, Low, or Normal) of each parameter and subsequently presents the evaluations and any potential diagnoses. The primary objective is to automate the extraction and evaluation of CBC parameters, aiding healthcare professionals in swiftly assessing blood analysis results. The algorithmic framework aims to streamline the interpretation of CBC tests, potentially improving efficiency and accuracy in clinical diagnostics.
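The evaluation step described above (identify each CBC parameter and its value, then label it High, Low, or Normal) might look like the following sketch once OCR has produced text rows. The row format, the regular expression, and especially the reference ranges are hypothetical placeholders for illustration; they are not clinical values and are not taken from the paper.

```python
import re

# Hypothetical placeholder ranges for illustration only; NOT clinical reference values.
REFERENCE_RANGES = {"HGB": (12.0, 16.0), "WBC": (4.0, 11.0), "PLT": (150.0, 450.0)}

# Assumed OCR row format: parameter name, whitespace, numeric value, optional unit.
ROW_PATTERN = re.compile(r"^\s*([A-Za-z]+)\s+(\d+(?:\.\d+)?)")

def classify(value, low, high):
    """Label a value relative to an inclusive reference range."""
    if value < low:
        return "Low"
    if value > high:
        return "High"
    return "Normal"

def evaluate(ocr_lines, ranges=REFERENCE_RANGES):
    """Map each recognized parameter to (value, status); other rows are skipped."""
    results = {}
    for line in ocr_lines:
        match = ROW_PATTERN.match(line)
        if not match:
            continue
        name, value = match.group(1).upper(), float(match.group(2))
        if name in ranges:
            low, high = ranges[name]
            results[name] = (value, classify(value, low, high))
    return results

report = evaluate(["HGB 10.5 g/dL", "WBC 7.2", "PLT 510", "Patient: n/a"])
```

Passing the ranges as a parameter keeps the classification logic separate from any laboratory-specific reference table.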
In intelligent perception and diagnosis of medical equipment, the visual and morphological changes in retinal vessels are closely related to the severity of cardiovascular diseases (e.g., diabetes and hypertension). Intelligent auxiliary diagnosis of these diseases depends on the accuracy of the retinal vascular segmentation results. To address this challenge, we design a Dual-Branch-UNet framework, which comprises a Dual-Branch encoder structure for feature extraction based on the traditional U-Net model for medical image segmentation. To be more explicit, we utilize a novel parallel encoder made up of various convolutional modules to enhance the encoder portion of the original U-Net. Then, image features are combined at each layer to produce richer semantic data, and the model’s capacity is adjusted to various input images. Meanwhile, in the downsampling section, we abandon pooling and perform downsampling by a convolution operation, controlling the stride for information fusion. We also employ an attention module in the decoder stage to filter image noise so as to lessen the response of irrelevant features. Experiments are conducted and compared on the DRIVE and ARIA datasets for retinal vessel segmentation. The proposed Dual-Branch-UNet proves superior to five other typical state-of-the-art methods.
This study investigated the correlations between mechanical properties and mineralogy of granite using digital image processing (DIP) and the discrete element method (DEM). The results showed that the X-ray diffraction (XRD)-based DIP method effectively analyzed the mineral composition contents and spatial distributions of granite. During the particle flow code (PFC2D) model calibration phase, the numerical simulation showed that the uniaxial compressive strength (UCS) value, elastic modulus (E), and failure pattern of the granite specimen in the UCS test were comparable to those of the experiment. By establishing 351 sets of numerical models and exploring the impacts of mineral composition on the mechanical properties of granite, the study indicated that there was no negative correlation between quartz and feldspar for UCS, tensile strength (σ_t), and E. In contrast, mica had a significant negative correlation for UCS, σ_t, and E. The presence of quartz increased the brittleness of granite, whereas the presence of mica and feldspar increased its ductility in UCS and direct tensile strength (DTS) tests. Varying contents of major mineral compositions in granite showed a minor influence on the number of cracks in both UCS and DTS tests.
A comprehensive understanding of the spatial distribution and clustering patterns of gravels is of great significance for ecological restoration and monitoring. However, traditional methods for studying gravels are inefficient and error-prone. This study investigated the spatial distribution and cluster characteristics of gravels based on digital image processing technology combined with a self-organizing map (SOM) and multivariate statistical methods in the grassland of the northern Tibetan Plateau. Moreover, the correlation of morphological parameters of gravels between different cluster groups and the environmental factors affecting gravel distribution were analyzed. The results showed that the morphological characteristics of gravels in the northern region (cluster C) and southern region (cluster B) of the Tibetan Plateau were similar, with a low gravel coverage, small gravel diameter, and elongated shape. These regions were mainly distributed in high mountainous areas with large topographic relief. The central region (cluster A) had a high coverage of gravels with a larger diameter, mainly distributed in high-altitude plains with smaller undulation. Principal component analysis (PCA) results showed that the gravel distribution of cluster A may be mainly affected by vegetation, while those in clusters B and C could be mainly affected by topography, climate, and soil. The study confirmed that the combination of digital image processing technology and SOM could effectively analyze the spatial distribution characteristics of gravels, providing a new mode for gravel research.
Large structures, such as bridges, highways, etc., need to be inspected to evaluate their actual physical and functional condition, to predict future conditions, and to help decision makers allocate maintenance and rehabilitation resources. The assessment of civil infrastructure condition is carried out through information obtained by inspection and/or monitoring operations. Traditional techniques in structural health monitoring (SHM) involve visual inspection according to inspection standards, which can make data collection time-consuming, expensive, labor-intensive, and dangerous. To address these limitations, machine vision-based inspection procedures have increasingly been investigated within the research community. In this context, this paper proposes and compares four different computer vision procedures to identify damage by image processing: Otsu method thresholding, Markov random fields segmentation, RGB color detection technique, and K-means clustering algorithm. The first method is based on segmentation by thresholding that returns a binary image from a grayscale image. The Markov random fields technique uses a probabilistic approach to assign labels to model the spatial dependencies in image pixels. The RGB technique uses color detection to evaluate the defect extensions. Finally, the K-means algorithm is based on Euclidean distance for clustering of the images. The benefits and limitations of each technique are discussed, and the challenges of using the techniques are highlighted. To show the effectiveness of the described techniques in damage detection of civil infrastructures, a case study is presented. Results show that various types of corrosion and cracks can be detected by image processing techniques, making the proposed techniques a suitable tool for the prediction of the damage evolution in civil infrastructures.
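The first of the four procedures, Otsu thresholding that returns a binary image from a grayscale image, can be sketched as below. This is a generic textbook formulation operating on a flat list of 8-bit pixel values, not the paper's implementation; the function names are illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class; pixels with value > t form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0          # number of pixels in the lower class
    sum0 = 0.0      # intensity sum of the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def binarize(pixels, threshold):
    """Map each pixel to 1 (above threshold) or 0 (at or below it)."""
    return [1 if p > threshold else 0 for p in pixels]

pixels = [10] * 50 + [200] * 50   # dark background, bright defect region
t = otsu_threshold(pixels)
mask = binarize(pixels, t)
```

For a clearly bimodal histogram such as this one, any threshold between the two modes separates the classes, and the search returns one of them.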
This paper presents an improved approach for detecting copy-move forgery based on singular value decomposition (SVD). It is a block-based method where the image is scanned from left to right and top to bottom by a sliding window of a predetermined size. At each step, the SVD is determined. First, the diagonal matrix’s maximum value (norm) is selected (representing the scaling factor for SVD and a fixed value for each set of matrix elements even when the matrix is rotated or scaled). Then, the similar norms are grouped, and each leading group is separated into many subgroups (elements of each subgroup are neighbors) according to 8-adjacency (the subgroups for each leading group must be far from others by a specific distance). After that, a weight is assigned to each subgroup to classify the image as forged or not. Finally, the F1 score of the proposed system is measured, reaching 99.1%. This approach is robust against rotation, scaling, noisy images, and illumination variation. It is compared with other similar methods and presents very promising results.
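The property the method above relies on, that the largest singular value of a block stays fixed when the block is rotated, can be checked with a small sketch. It assumes non-negative pixel blocks and estimates the largest singular value by power iteration instead of a full SVD; the function names and the iteration count are illustrative, not from the paper.

```python
import math

def largest_singular_value(block, iterations=100):
    """Estimate the largest singular value of a block by power iteration on A^T A.

    Assumes the all-ones start vector is not orthogonal to the dominant right
    singular vector, which holds for non-negative, non-zero pixel blocks.
    """
    rows, cols = len(block), len(block[0])
    v = [1.0 / math.sqrt(cols)] * cols
    sigma = 0.0
    for _ in range(iterations):
        # av = A v; its length approaches the largest singular value.
        av = [sum(block[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        sigma = math.sqrt(sum(x * x for x in av))
        # w = A^T (A v), then renormalize to get the next unit vector.
        w = [sum(block[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    return sigma

def rotate90(block):
    """Rotate a block by 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]

block = [[1.0, 2.0], [3.0, 4.0]]
original_norm = largest_singular_value(block)
rotated_norm = largest_singular_value(rotate90(block))
```

A copied and rotated region therefore yields the same norm as its source block, which is what allows similar norms to be grouped as forgery candidates.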
Specific medical data are limited in that samples are few and not standardized. To overcome these limitations, it is necessary to study how to process such limited amounts of data efficiently. In this paper, deep learning methods for automatically determining cardiovascular diseases are described, and an effective preprocessing method for CT images that can be applied to improve the performance of deep learning is presented. The cardiac CT images include several parts of the body such as the heart, lungs, spine, and ribs. The preprocessing step proposed in this paper divides CT image data into regions of interest and other regions using K-means clustering and the GrabCut algorithm. We compared the deep learning performance results of original data, data using only K-means clustering, and data using both K-means clustering and the GrabCut algorithm. All data used in this paper were collected at Soonchunhyang University Cheonan Hospital in Korea, and the experimental test proceeded with IRB approval. The training was conducted using ResNet-50, VGG, and Inception-ResNet-V2 models, and ResNet-50 had the best accuracy in validation and testing. Through the preprocessing process proposed in this paper, the accuracy of deep learning models was significantly improved, by at least 10% and up to 40%.
There are two types of methods for image segmentation. One is traditional image processing methods, which are sensitive to details and boundaries, yet fail to recognize semantic information. The other is deep learning methods, which can locate and identify different objects, but whose boundary identification is not accurate enough. Neither can generate complete segmentation information. In order to obtain accurate edge detection and semantic information, an Adaptive Boundary and Semantic Composite Segmentation (ABSCS) method is proposed. This method can precisely perform semantic segmentation of individual objects in large-size aerial images with limited GPU performance. It includes adaptively dividing and modifying the aerial images with the proposed principles and methods, using the deep learning method to semantically segment and preprocess the small divided pieces, using three traditional methods to segment and preprocess original-size aerial images, adaptively selecting traditional results to modify the boundaries of individual objects in deep learning results, and combining the results of different objects. Individual object semantic segmentation experiments are conducted using the AeroScapes dataset, and their results are analyzed qualitatively and quantitatively. The experimental results demonstrate that the proposed method can achieve more promising object boundaries than the original deep learning method. This work also demonstrates the advantages of the proposed method in applications of point cloud semantic segmentation and image inpainting.
The current study provides a quantum calculus-based medical image enhancement technique that dynamically chooses the spatial distribution of image pixel intensity values. The technique focuses on boosting the edges and texture of an image while leaving the smooth areas alone. Brain Magnetic Resonance Imaging (MRI) scans are used to visualize the tumors that have spread throughout the brain in order to gain a better understanding of the stage of brain cancer. Accurately detecting brain cancer is a complex challenge that the medical system faces when diagnosing the disease. To solve this issue, this research offers a quantum calculus-based MRI image enhancement as a pre-processing step for brain cancer diagnosis. The proposed image enhancement approach improves images with low gray level changes by estimating the pixel’s quantum probability. The suggested image enhancement technique is demonstrated to be robust and resistant to major quality changes on a variety of MRI scan datasets of variable quality. For MRI scans, the BRISQUE (blind/referenceless image spatial quality evaluator) and NIQE (natural image quality evaluator) measures were 39.38 and 3.58, respectively. The proposed image enhancement model, according to the data, produces the best image quality ratings, and it may be able to aid medical experts in the diagnosis process. The experimental results were achieved using a publicly available collection of MRI scans.
Deep learning has been widely used in the field of mammographic image classification owing to its superiority in automatic feature extraction.However,general deep learning models cannot achieve very satisfactory class...Deep learning has been widely used in the field of mammographic image classification owing to its superiority in automatic feature extraction.However,general deep learning models cannot achieve very satisfactory classification results on mammographic images because these models are not specifically designed for mammographic images and do not take the specific traits of these images into account.To exploit the essential discriminant information of mammographic images,we propose a novel classification method based on a convolutional neural network.Specifically,the proposed method designs two branches to extract the discriminative features from mammographic images from the mediolateral oblique and craniocaudal(CC)mammographic views.The features extracted from the two-view mammographic images contain complementary information that enables breast cancer to be more easily distinguished.Moreover,the attention block is introduced to capture the channel-wise information by adjusting the weight of each feature map,which is beneficial to emphasising the important features of mammographic images.Furthermore,we add a penalty term based on the fuzzy cluster algorithm to the cross-entropy function,which improves the generalisation ability of the classification model by maximising the interclass distance and minimising the intraclass distance of the samples.The experimental results on The Digital database for Screening Mammography INbreast and MIAS mammography databases illustrate that the proposed method achieves the best classification performance and is more robust than the compared state-ofthe-art classification methods.展开更多
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52090081) and the State Key Laboratory of Hydro-science and Hydraulic Engineering (Grant No. 2021-KY-04).
Abstract: Geological discontinuity (GD) plays a pivotal role in determining the catastrophic mechanical failure of jointed rock masses. Accurate and efficient acquisition of GD networks is essential for characterizing and understanding the progressive damage mechanisms of slopes based on monitoring image data. Inspired by recent advances in computer vision, deep learning (DL) models have been widely utilized for image-based fracture identification. The multi-scale characteristics, image resolution and annotation quality of images cause a scale-space effect (SSE) that makes features indistinguishable from noise, directly affecting the accuracy. However, this effect has not received adequate attention. Herein, we address this gap by collecting slope images at various proportional scales and constructing multi-scale datasets using image processing techniques. Next, we quantify the intensity of feature signals using metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Combining these metrics with scale-space theory, we investigate the influence of the SSE on the differentiation of multi-scale features and the accuracy of recognition. It is found that augmenting the image's detail capacity does not always yield benefits for vision-based recognition models. In light of these observations, we propose a scale hybridization approach based on the diffusion mechanism of scale-space representation. The results show that scale hybridization strengthens the tolerance of multi-scale feature recognition under complex environmental noise interference and significantly enhances the recognition accuracy of GD. It also facilitates the objective understanding, description and analysis of the rock behavior and stability of slopes from the perspective of image data.
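The abstract above names PSNR as one of the metrics for quantifying feature-signal intensity. As a minimal sketch of the standard PSNR definition (not the authors' pipeline), the following assumes 8-bit grayscale images held in NumPy arrays; the function name `psnr`, the synthetic test image and the noise level are all illustrative assumptions.

```python
import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)  # mean squared error
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Hypothetical data: a random "clean" image and a noisy copy of it.
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(clean + rng.normal(0.0, 10.0, size=clean.shape), 0, 255)

print(f"PSNR of noisy copy: {psnr(clean, noisy):.2f} dB")
```

A lower PSNR means the signal is closer to the noise floor, which is the sense in which the abstract relates the metric to features becoming indistinguishable from noise.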
Funding: National Research Foundation of Korea, Grant/Award Numbers: 2022R1I1A3069113, RS-2023-00221365; Electronics and Telecommunications Research Institute, Grant/Award Number: 2014-3-00123.
Abstract: In recent times, an image enhancement approach that learns the global transformation function using deep neural networks has gained attention. However, many existing methods based on this approach have a limitation: their transformation functions are too simple to imitate complex colour transformations between low-quality images and manually retouched high-quality images. To address this limitation, a simple yet effective approach for image enhancement is proposed. The proposed algorithm is based on channel-wise intensity transformation; however, the transformation is applied in a learnt embedding space rather than in a specific colour space, and the enhanced features are then mapped back to colours. To this end, the authors define the continuous intensity transformation (CIT) to describe the mapping between input and output intensities in the embedding space. Then, the enhancement network is developed, which produces multi-scale feature maps from input images, derives the set of transformation functions, and performs the CIT to obtain enhanced images. Extensive experiments on the MIT-Adobe 5K dataset demonstrate that the authors' approach improves the performance of conventional intensity transforms on colour space metrics. Specifically, the authors achieved a 3.8% improvement in peak signal-to-noise ratio, a 1.8% improvement in structural similarity index measure, and a 27.5% improvement in learned perceptual image patch similarity. Also, the authors' algorithm outperforms state-of-the-art alternatives on three image enhancement datasets: MIT-Adobe 5K, Low-Light, and Google HDR+.
Funding: Project supported by the Scientific Research Fund of Hunan Provincial Education Department, China (Grant No. 21A0470), the Natural Science Foundation of Hunan Province, China (Grant No. 2023JJ50268), the National Natural Science Foundation of China (Grant Nos. 62172268 and 62302289), and the Shanghai Science and Technology Project, China (Grant Nos. 21JC1402800 and 23YF1416200).
Abstract: As a branch of quantum image processing, quantum image scaling has been widely studied. However, most of the existing quantum image scaling algorithms are based on nearest-neighbor interpolation and bilinear interpolation; the quantum version of bicubic interpolation has not yet been studied. In this work, we present the first quantum image scaling scheme for bicubic interpolation based on the novel enhanced quantum representation (NEQR). Our scheme can realize synchronous enlargement and reduction of an image of size 2^n × 2^n by integer multiples. Firstly, the image is represented by NEQR and the original image coordinates are obtained through multiple CNOT modules. Then, 16 neighborhood pixels are obtained by quantum operation circuits, and the corresponding weights of these pixels are calculated by quantum arithmetic modules. Finally, a quantum matrix operation, instead of a classical convolution operation, is used to realize the sum of convolution of these pixels. Through simulation experiments and complexity analysis, we demonstrate that our scheme achieves exponential speedup over the classical bicubic interpolation algorithm, and has a better effect than the quantum version of bilinear interpolation.
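For orientation, the classical bicubic interpolation that the quantum scheme above is compared against can be sketched as follows: each output sample is a weighted sum of the 16 pixels in a 4 × 4 neighborhood. This is a classical illustration only, not the quantum circuit. The Keys cubic convolution kernel with a = -0.5 and the border clamping are assumptions, since the abstract does not specify them, and the function names are hypothetical.

```python
import numpy as np


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (a = -0.5 is an assumed, common choice)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def bicubic_sample(img, y, x):
    """Interpolate img at fractional position (y, x) from its 16 neighbors."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for m in range(-1, 3):          # 4 rows of the neighborhood
        for n in range(-1, 3):      # 4 columns of the neighborhood
            yy = min(max(y0 + m, 0), h - 1)  # clamp at the image border
            xx = min(max(x0 + n, 0), w - 1)
            weight = cubic_kernel(y - (y0 + m)) * cubic_kernel(x - (x0 + n))
            value += weight * img[yy, xx]
    return value


# Hypothetical 6 x 6 ramp image used only to exercise the function.
ramp = np.arange(36, dtype=np.float64).reshape(6, 6)
print(bicubic_sample(ramp, 2.5, 2.5))
```

The per-pixel weights are products of two one-dimensional kernel values, which is the part the abstract assigns to quantum arithmetic modules.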
Funding: The National Natural Science Foundation of China (62003298, 62163036), the Major Project of Science and Technology of Yunnan Province (202202AD080005, 202202AH080009), and the Yunnan University Professional Degree Graduate Practice Innovation Fund Project (ZC-22222770).
Abstract: Oscillation detection has been a hot research topic in industry due to the high incidence of oscillating loops and their negative impact on plant profitability. Although numerous automatic detection techniques have been proposed, most of them can only address part of the practical difficulties. An oscillation is heuristically defined as a visually apparent periodic variation. However, manual visual inspection is labor-intensive and prone to missed detection. Convolutional neural networks (CNNs), inspired by animal visual systems, have emerged with powerful feature extraction capabilities. In this work, an exploration of typical CNN models for visual oscillation detection is performed. Specifically, we tested the MobileNet-V1, ShuffleNet-V2, EfficientNet-B0, and GhostNet models, and found that such a visual framework is well-suited for oscillation detection. The feasibility and validity of this framework are verified using extensive numerical and industrial cases. Compared with state-of-the-art oscillation detectors, the suggested framework is more straightforward and more robust to noise and mean-nonstationarity. In addition, this framework generalizes well and is capable of handling features that are not present in the training data, such as multiple oscillations and outliers.
Abstract: Diagnosing various diseases such as glaucoma, age-related macular degeneration, cardiovascular conditions, and diabetic retinopathy involves segmenting retinal blood vessels. The task is particularly challenging when dealing with color fundus images due to issues like non-uniform illumination, low contrast, and variations in vessel appearance, especially in the presence of different pathologies. Furthermore, the speed of the retinal vessel segmentation system is of utmost importance. With the surge of big data now available, the speed of the algorithm becomes increasingly important, carrying almost the same weight as its accuracy. To address these challenges, we present a novel approach for retinal vessel segmentation, leveraging efficient and robust techniques based on multiscale line detection and mathematical morphology. Our algorithm's performance is evaluated on two publicly available datasets, namely the Digital Retinal Images for Vessel Extraction dataset (DRIVE) and the Structure Analysis of Retina (STARE) dataset. The experimental results demonstrate the effectiveness of our method, with mean accuracy values of 0.9467 for DRIVE and 0.9535 for STARE datasets, as well as sensitivity values of 0.6952 for DRIVE and 0.6809 for STARE datasets. Notably, our algorithm exhibits competitive performance with state-of-the-art methods. Importantly, it operates at an average speed of 3.73 s per image for DRIVE and 3.75 s for STARE datasets. It is worth noting that these results were achieved using Matlab scripts containing multiple loops. This suggests that the processing time can be further reduced by replacing loops with vectorization. Thus the proposed algorithm can be deployed in real-time applications. In summary, our proposed system strikes a fine balance between swift computation and accuracy that is on par with the best available methods in the field.
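The accuracy and sensitivity figures quoted above are pixel-wise metrics. As a minimal sketch of how such numbers are conventionally computed from a predicted and a ground-truth binary vessel mask (the standard definitions, assumed here rather than taken from the paper), consider the following; the function name and the tiny 2 × 4 masks are hypothetical.

```python
import numpy as np


def accuracy_and_sensitivity(pred, truth):
    """Pixel-wise accuracy and sensitivity (recall) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)        # vessel pixels correctly detected
    tn = np.sum(~pred & ~truth)      # background pixels correctly rejected
    fn = np.sum(~pred & truth)       # vessel pixels that were missed
    accuracy = (tp + tn) / pred.size
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    return float(accuracy), float(sensitivity)


# Hypothetical masks: 1 marks a vessel pixel, 0 marks background.
truth_mask = np.array([[1, 1, 0, 0],
                       [1, 0, 0, 0]])
pred_mask = np.array([[1, 0, 0, 0],
                      [1, 0, 1, 0]])

acc, sens = accuracy_and_sensitivity(pred_mask, truth_mask)
print(f"accuracy={acc:.3f}, sensitivity={sens:.3f}")
```

Because background pixels usually far outnumber vessel pixels in fundus images, accuracy can be high while sensitivity stays much lower, which matches the pattern of the reported values.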
Correspondence: Kejie Huang, Department of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, Zhejiang, China. Email: huangkejie@zju.edu.cn. Liyuan Ma, ORCID: https://orcid.org/0000-0002-9492-5324.
Abstract: Person image generation aims to generate images that maintain the original human appearance in different target poses. Recent works have revealed that the critical element in achieving this task is the alignment of the appearance domain and the pose domain. Previous alignment methods, such as appearance flow warping, correspondence learning and cross attention, often encounter challenges when it comes to producing fine texture details. These approaches suffer from limitations in accurately estimating appearance flows due to the lack of a global receptive field. Alternatively, they can only perform cross-domain alignment on high-level feature maps with small spatial dimensions, since the computational complexity increases quadratically with larger feature sizes. In this article, the significance of multi-scale alignment, in both low-level and high-level domains, for ensuring reliable cross-domain alignment of appearance and pose is demonstrated. To this end, a novel and effective method named Multi-scale Cross-domain Alignment (MCA) is proposed. Firstly, MCA adopts a global context aggregation transformer to model multi-scale interaction between pose and appearance inputs, which employs pair-wise window-based cross attention. Furthermore, leveraging the integrated global source information for each target position, MCA applies a flexible flow prediction head and point correlation to effectively conduct warping and fusing for final transformed person image generation. The proposed MCA outperforms other methods on two popular datasets, which verifies the effectiveness of the approach.
Funding: Sponsored by the National Natural Science Foundation of China (NSFC) under grant numbers 11773073, 11873027, U2031140, and 11833010; the Yunnan Key Laboratory of Solar Physics and Space Science under number 202205AG070009; the Yunnan Provincial Science and Technology Department (202103AD50013, 202105AB160001, 202305AH340002); the GHfund A202302013242; and the CAS "Light of West China" Program 202305AS350029.
Abstract: Strong atmospheric turbulence reduces astronomical seeing, causing speckle images acquired by ground-based solar telescopes to become blurred and distorted. Severe distortion in speckle images impedes image phase deviation in the speckle masking reconstruction method, leading to the appearance of spurious imaging artifacts. Relying only on linear image degradation principles to reconstruct solar images is insufficient. To solve this problem, we propose the multiframe blind deconvolution combined with non-rigid alignment (MFBD-CNRA) method for solar image reconstruction. We consider image distortion caused by atmospheric turbulence and use non-rigid alignment to correct pixel-level distortion, thereby achieving nonlinear constraints to complement image intensity changes. After creating the corrected speckle image, we use the linear method to solve the wavefront phase, obtaining the target image. We verify the effectiveness of our method against other methods using solar observation data from the 1 m New Vacuum Solar Telescope (NVST). This new method successfully reconstructs high-resolution images of solar observations with a Fried parameter r0 of approximately 10 cm, and enhances images at high frequency. When r0 is approximately 5 cm, the new method is even more effective. It reconstructs the edges of solar granulation and sunspots, and is greatly enhanced at mid and high frequency compared with other methods. Comparisons confirm the effectiveness of this method with respect to both nonlinear and linear constraints in solar image reconstruction. This provides a suitable solution for image reconstruction in ground-based solar observations under strong atmospheric turbulence.
Funding: Supported by the Key Research and Development Projects in Shaanxi Province (Program No. 2021GY-306), the Innovation Capability Support Program of Shaanxi (Program No. 2022KJXX-41), and the Key Scientific and Technological Projects of Xi'an (Program No. 2022JH-RGZN-0005).
Abstract: The accumulation of snow and ice on PV modules can have a detrimental impact on power generation, leading to reduced efficiency for prolonged periods. Thus, it becomes imperative to develop an intelligent system capable of accurately assessing the extent of snow and ice coverage on PV modules. To address this issue, the article proposes an ice and snow recognition algorithm that segments the ice and snow areas within the collected images. Furthermore, the algorithm incorporates an analysis of the morphological characteristics of ice and snow coverage on PV modules, allowing for the establishment of a residual ice and snow recognition process. This process utilizes both the external ellipse method and the pixel statistical method to refine the identification process. The effectiveness of the proposed algorithm is validated through extensive testing with images of isolated and continuous snow areas. The results demonstrate the algorithm's accuracy and reliability in identifying and quantifying residual snow and ice on PV modules. In conclusion, this research presents a valuable method for accurately detecting and quantifying snow and ice coverage on PV modules. This is significant for PV power plants, as it enables predictions of power generation efficiency and facilitates efficient PV maintenance during challenging winter conditions characterized by snow and ice. By proactively managing snow and ice coverage, PV power plants can optimize energy production and minimize downtime, ensuring a sustainable and reliable renewable energy supply.
Abstract: Real-time capabilities and computational efficiency are provided by parallel image processing utilizing OpenMP. However, race conditions can affect the accuracy and reliability of the outcomes. This paper highlights the importance of addressing race conditions in parallel image processing, specifically focusing on color inverse filtering using OpenMP. We considered three solutions to race conditions, each with distinct characteristics: #pragma omp atomic, which protects individual memory operations for fine-grained control; #pragma omp critical, which protects entire code blocks for exclusive access; and #pragma omp parallel sections reduction, which employs a reduction clause for safe aggregation of values across threads. Our findings show that the produced images were unaffected by race conditions. However, it becomes evident that solving the race conditions in the code makes it significantly faster, especially when it is executed on multiple cores.
Abstract: In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of color images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of color images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP's effectiveness in accelerating image manipulation tasks.
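The workload distribution described above can be illustrated with a small sketch. Note the substitution: it uses Python threads from `concurrent.futures` rather than OpenMP, so it shows only the row-wise partitioning idea, not the authors' implementation or their measured speedups. The ITU-R BT.601 luminance weights, the function names and the worker count are assumptions. Each worker writes to a disjoint band of rows, so no two threads touch the same output element.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Assumed luminance weights (ITU-R BT.601); the abstract does not state them.
WEIGHTS = np.array([0.299, 0.587, 0.114])


def to_gray_rows(rgb, out, start, stop):
    """Convert rows [start, stop) of an H x W x 3 image into out."""
    out[start:stop] = rgb[start:stop] @ WEIGHTS


def to_gray_parallel(rgb, workers=4):
    """Grayscale conversion with the rows split evenly across worker threads."""
    rgb = np.asarray(rgb, dtype=np.float64)
    out = np.empty(rgb.shape[:2], dtype=np.float64)
    bounds = np.linspace(0, rgb.shape[0], workers + 1, dtype=int)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(to_gray_rows, rgb, out, int(a), int(b))
            for a, b in zip(bounds[:-1], bounds[1:])
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return out


# Hypothetical random color image used only to exercise the function.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(120, 80, 3)).astype(np.float64)
gray = to_gray_parallel(image, workers=4)
print(gray.shape)
```

Splitting by contiguous row bands keeps the load roughly balanced for a uniform per-pixel operation, which is the kind of consideration the abstract raises about parallelization efficiency.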
Abstract: Health care is an important part of human life and is a right for everyone. One of the most basic human rights is to receive health care whenever it is needed. However, not everyone has access to it, owing to the social conditions in which some communities live. This paper aims to serve as a reference point and guide for users who are interested in monitoring their health, particularly their blood analysis, so that they can be aware of their health condition in an easy way. This study introduces an algorithmic approach for extracting and analyzing Complete Blood Count (CBC) parameters from scanned images. The algorithm employs Optical Character Recognition (OCR) technology to process images containing tabular data, specifically targeting CBC parameter tables. Upon image processing, the algorithm extracts data and identifies CBC parameters and their corresponding values. It evaluates the status (High, Low, or Normal) of each parameter and subsequently presents evaluations and any potential diagnoses. The primary objective is to automate the extraction and evaluation of CBC parameters, aiding healthcare professionals in swiftly assessing blood analysis results. The algorithmic framework aims to streamline the interpretation of CBC tests, potentially improving efficiency and accuracy in clinical diagnostics.
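The status-evaluation step described above (High, Low, or Normal per parameter) can be sketched as a simple range lookup applied to values already extracted by OCR. Everything specific here is an assumption for illustration: the parameter names, the `REFERENCE_RANGES` table and the function names are hypothetical, and the numeric ranges are placeholders rather than clinical reference values.

```python
# Placeholder ranges for illustration only; NOT clinical reference values.
REFERENCE_RANGES = {
    "HGB": (12.0, 17.5),    # hypothetical lower/upper bounds
    "WBC": (4.0, 11.0),
    "PLT": (150.0, 450.0),
}


def classify_cbc(name, value, ranges=REFERENCE_RANGES):
    """Return 'Low', 'High' or 'Normal' for one extracted CBC value."""
    low, high = ranges[name]
    if value < low:
        return "Low"
    if value > high:
        return "High"
    return "Normal"


def evaluate_report(values, ranges=REFERENCE_RANGES):
    """Classify every known parameter in a {name: value} mapping."""
    return {
        name: classify_cbc(name, value, ranges)
        for name, value in values.items()
        if name in ranges  # skip parameters with no configured range
    }


# Hypothetical values as they might come out of the OCR stage.
extracted = {"HGB": 10.2, "WBC": 7.5, "PLT": 512.0, "XYZ": 1.0}
print(evaluate_report(extracted))
```

In a real system the ranges would need to come from a validated clinical source and typically depend on age and sex, which this sketch deliberately ignores.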
Funding: Supported by the National Natural Science Foundation of China (NSFC) (61976123, 62072213), the Taishan Young Scholars Program of Shandong Province, and the Key Development Program for Basic Research of Shandong Province (ZR2020ZD44).
Abstract: In intelligent perception and diagnosis of medical equipment, the visual and morphological changes in retinal vessels are closely related to the severity of cardiovascular diseases (e.g., diabetes and hypertension). Intelligent auxiliary diagnosis of these diseases depends on the accuracy of the retinal vascular segmentation results. To address this challenge, we design a Dual-Branch-UNet framework, which comprises a Dual-Branch encoder structure for feature extraction based on the traditional U-Net model for medical image segmentation. To be more explicit, we utilize a novel parallel encoder made up of various convolutional modules to enhance the encoder portion of the original U-Net. Then, image features are combined at each layer to produce richer semantic data, and the model's capacity is adjusted to various input images. Meanwhile, in the downsampling section, we abandon pooling and perform downsampling by a convolution operation, controlling the stride for information fusion. We also employ an attention module in the decoder stage to filter image noise so as to lessen the response of irrelevant features. Experiments are verified and compared on the DRIVE and ARIA datasets for retinal vessel segmentation. The proposed Dual-Branch-UNet has proved to be superior to five other typical state-of-the-art methods.
Funding: This research was supported by the Department of Mining Engineering at the University of Utah. In addition, the lead author wishes to acknowledge the financial support received from the Talent Introduction Project, part of the Elite Program of Shandong University of Science and Technology (No. 0104060540171).
Abstract: This study investigated the correlations between the mechanical properties and mineralogy of granite using digital image processing (DIP) and the discrete element method (DEM). The results showed that the X-ray diffraction (XRD)-based DIP method effectively analyzed the mineral composition contents and spatial distributions of granite. During the particle flow code (PFC2D) model calibration phase, the numerical simulation showed that the uniaxial compressive strength (UCS) value, elastic modulus (E), and failure pattern of the granite specimen in the UCS test were comparable to the experiment. By establishing 351 sets of numerical models and exploring the impacts of mineral composition on the mechanical properties of granite, the study indicated that quartz and feldspar showed no negative correlation with UCS, tensile strength (σ_t), and E. In contrast, mica had a significant negative correlation with UCS, σ_t, and E. The presence of quartz increased the brittleness of granite, whereas the presence of mica and feldspar increased its ductility in UCS and direct tensile strength (DTS) tests. Varying contents of the major mineral compositions in granite showed a minor influence on the number of cracks in both UCS and DTS tests.
Funding: Funded by the National Natural Science Foundation of China (41971226, 41871357), the Major Research and Development and Achievement Transformation Projects of Qinghai, China (2022-QY-224), and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA28110502, XDA19030303).
Abstract: A comprehensive understanding of the spatial distribution and clustering patterns of gravels is of great significance for ecological restoration and monitoring. However, traditional methods for studying gravels are inefficient and error-prone. This study examined the spatial distribution and cluster characteristics of gravels in the grassland of the northern Tibetan Plateau, based on digital image processing technology combined with a self-organizing map (SOM) and multivariate statistical methods. Moreover, the correlation of morphological parameters of gravels between different cluster groups and the environmental factors affecting gravel distribution were analyzed. The results showed that the morphological characteristics of gravels in the northern region (cluster C) and southern region (cluster B) of the Tibetan Plateau were similar, with low gravel coverage, small gravel diameter, and elongated shape. These regions were mainly distributed in high mountainous areas with large topographic relief. The central region (cluster A) had high coverage of gravels with a larger diameter, mainly distributed in high-altitude plains with smaller undulation. Principal component analysis (PCA) results showed that the gravel distribution of cluster A may be mainly affected by vegetation, while those in clusters B and C could be mainly affected by topography, climate, and soil. The study confirmed that the combination of digital image processing technology and SOM could effectively analyze the spatial distribution characteristics of gravels, providing a new mode for gravel research.
Funding: Part of the research leading to these results has received funding from the research project DESDEMONA–Detection of Steel Defects by Enhanced MONitoring and Automated procedure for self-inspection and maintenance (grant agreement number RFCS-2018_800687), supported by EU Call RFCS-2017, and was sponsored by the NATO Science for Peace and Security Programme under grant id. G5924.
Abstract: Large structures, such as bridges, highways, etc., need to be inspected to evaluate their actual physical and functional condition, to predict future conditions, and to help decision makers allocate maintenance and rehabilitation resources. The assessment of civil infrastructure condition is carried out through information obtained by inspection and/or monitoring operations. Traditional techniques in structural health monitoring (SHM) involve visual inspection according to inspection standards, which can make data collection time-consuming, expensive, labor-intensive, and dangerous. To address these limitations, machine vision-based inspection procedures have increasingly been investigated within the research community. In this context, this paper proposes and compares four different computer vision procedures to identify damage by image processing: Otsu method thresholding, Markov random fields segmentation, RGB color detection technique, and K-means clustering algorithm. The first method is based on segmentation by thresholding that returns a binary image from a grayscale image. The Markov random fields technique uses a probabilistic approach to assign labels to model the spatial dependencies in image pixels. The RGB technique uses color detection to evaluate the defect extensions. Finally, the K-means algorithm is based on Euclidean distance for clustering of the images. The benefits and limitations of each technique are discussed, and the challenges of using the techniques are highlighted. To show the effectiveness of the described techniques in damage detection of civil infrastructures, a case study is presented. Results show that various types of corrosion and cracks can be detected by image processing techniques, making the proposed techniques a suitable tool for the prediction of damage evolution in civil infrastructures.
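Of the four procedures listed above, Otsu thresholding is the easiest to show compactly: it picks the gray level that maximizes the between-class variance of the histogram and returns a binary image. The following is a minimal sketch of the textbook method, assuming an 8-bit grayscale input; it is not the paper's code, and the function name and synthetic test image are hypothetical.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the 8-bit threshold that maximizes between-class variance."""
    gray = np.asarray(gray)
    hist = np.bincount(gray.ravel().astype(np.int64), minlength=256)
    prob = hist.astype(np.float64) / hist.sum()
    omega = np.cumsum(prob)                   # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to t
    mu_total = mu[-1]
    numerator = (mu_total * omega - mu) ** 2
    denominator = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denominator > 1e-12               # skip empty classes
    sigma_b[valid] = numerator[valid] / denominator[valid]
    return int(np.argmax(sigma_b))


# Hypothetical image: dark background (50) with a bright defect region (200).
img = np.full((40, 40), 50, dtype=np.uint8)
img[10:30, 5:25] = 200

t = otsu_threshold(img)
binary = img > t  # True where a pixel is brighter than the threshold
print(t, int(binary.sum()))
```

On real inspection images the histogram is rarely this cleanly bimodal, which is one of the limitations a comparison with the other three techniques would be expected to expose.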
Abstract: This paper presents an improved approach for detecting copy-move forgery based on singular value decomposition (SVD). It is a block-based method where the image is scanned from left to right and top to bottom by a sliding window of a determined size. At each step, the SVD is determined. First, the diagonal matrix's maximum value (norm) is selected (representing the scaling factor for SVD and a fixed value for each set of matrix elements even when the matrix is rotated or scaled). Then, similar norms are grouped, and each leading group is separated into many subgroups (elements of each subgroup are neighbors) according to 8-adjacency (the subgroups for each leading group must be far from the others by a specific distance). After that, a weight is assigned to each subgroup to classify the image as a forgery or not. Finally, the F1 score of the proposed system is measured, reaching 99.1%. This approach is robust against rotation, scaling, noisy images, and illumination variation. It is compared with other similar methods and presents very promising results.
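The first stage described above, taking the largest singular value of each sliding-window block as its signature, can be sketched as follows. This covers only the norm computation and a check that a copied block and a block rotated by 90 degrees share the same norm; the grouping, 8-adjacency and weighting stages are omitted. The block size, step, function name and synthetic image are assumptions, not values from the paper.

```python
import numpy as np


def block_norms(gray, block=8, step=1):
    """Map each window's top-left corner to its largest singular value."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    norms = {}
    for r in range(0, h - block + 1, step):        # top to bottom
        for c in range(0, w - block + 1, step):    # left to right
            window = gray[r:r + block, c:c + block]
            s = np.linalg.svd(window, compute_uv=False)  # sorted descending
            norms[(r, c)] = s[0]
    return norms


# Hypothetical forged image: one 8 x 8 block is copied to a second location.
rng = np.random.default_rng(2)
forged = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
forged[20:28, 18:26] = forged[2:10, 3:11]

norms = block_norms(forged)
source_norm = norms[(2, 3)]
copy_norm = norms[(20, 18)]

# A 90-degree rotation only permutes and transposes entries,
# so the singular values are unchanged.
rotated_norm = np.linalg.svd(np.rot90(forged[2:10, 3:11]), compute_uv=False)[0]
print(source_norm, copy_norm, rotated_norm)
```

Matching norms are only candidates for duplication; the later grouping and distance checks in the abstract are what separate true copies from coincidental matches.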
Funding: This research was supported under the framework of an international cooperation program managed by the National Research Foundation of Korea (NRF-2019K1A3A1A20093097), by the National Key Research and Development Program of China (2019YFE0107800), and by the Soonchunhyang University Research Fund.
Abstract: Certain medical data are limited in that they are scarce and not standardized. To overcome these limitations, it is necessary to study how to process such limited amounts of data efficiently. In this paper, deep learning methods for automatically determining cardiovascular diseases are described, and an effective preprocessing method for CT images that can be applied to improve the performance of deep learning is presented. The cardiac CT images include several parts of the body such as the heart, lungs, spine, and ribs. The preprocessing step proposed in this paper divided CT image data into regions of interest and other regions using K-means clustering and the Grabcut algorithm. We compared the deep learning performance results of original data, data using only K-means clustering, and data using both K-means clustering and the Grabcut algorithm. All data used in this paper were collected at Soonchunhyang University Cheonan Hospital in Korea, and the experimental test proceeded with IRB approval. The training was conducted using the Resnet 50, VGG, and Inception resnet V2 models, and Resnet 50 had the best accuracy in validation and testing. Through the preprocessing process proposed in this paper, the accuracy of the deep learning models was significantly improved, by at least 10% and up to 40%.
Funding: Funded in part by the Equipment Pre-Research Foundation of China, Grant No. 61400010203, and in part by the Independent Project of the State Key Laboratory of Virtual Reality Technology and Systems.
Abstract: There are two types of methods for image segmentation. One is traditional image processing methods, which are sensitive to details and boundaries, yet fail to recognize semantic information. The other is deep learning methods, which can locate and identify different objects, but whose boundary identification is not accurate enough. Neither can generate complete segmentation information. In order to obtain accurate edge detection and semantic information, an Adaptive Boundary and Semantic Composite Segmentation method (ABSCS) is proposed. This method can precisely perform semantic segmentation of individual objects in large-size aerial images with limited GPU performance. It includes adaptively dividing and modifying the aerial images with the proposed principles and methods, using the deep learning method to semantically segment and preprocess the small divided pieces, using three traditional methods to segment and preprocess original-size aerial images, adaptively selecting traditional results to modify the boundaries of individual objects in deep learning results, and combining the results of different objects. Individual object semantic segmentation experiments are conducted using the AeroScapes dataset, and their results are analyzed qualitatively and quantitatively. The experimental results demonstrate that the proposed method can achieve more promising object boundaries than the original deep learning method. This work also demonstrates the advantages of the proposed method in applications of point cloud semantic segmentation and image inpainting.
Abstract: The current study provides a quantum calculus-based medical image enhancement technique that dynamically chooses the spatial distribution of image pixel intensity values. The technique focuses on boosting the edges and texture of an image while leaving the smooth areas alone. Brain Magnetic Resonance Imaging (MRI) scans are used to visualize tumors that have spread throughout the brain in order to gain a better understanding of the stage of brain cancer. Accurately detecting brain cancer is a complex challenge that the medical system faces when diagnosing the disease. To solve this issue, this research offers a quantum calculus-based MRI image enhancement as a pre-processing step for brain cancer diagnosis. The proposed image enhancement approach improves images with low gray level changes by estimating the pixel's quantum probability. The suggested image enhancement technique is demonstrated to be robust and resistant to major quality changes on a variety of MRI scan datasets of variable quality. For MRI scans, the BRISQUE "blind/referenceless image spatial quality evaluator" and the NIQE "natural image quality evaluator" measures were 39.38 and 3.58, respectively. The proposed image enhancement model, according to the data, produces the best image quality ratings, and it may be able to aid medical experts in the diagnosis process. The experimental results were achieved using a publicly available collection of MRI scans.
Funding: Guangdong Basic and Applied Basic Research Foundation, Grant/Award Number: 2019A1515110582; Shenzhen Key Laboratory of Visual Object Detection and Recognition, Grant/Award Number: ZDSYS20190902093015527; National Natural Science Foundation of China, Grant/Award Number: 61876051.
Abstract: Deep learning has been widely used in the field of mammographic image classification owing to its superiority in automatic feature extraction. However, general deep learning models cannot achieve very satisfactory classification results on mammographic images because these models are not specifically designed for mammographic images and do not take the specific traits of these images into account. To exploit the essential discriminant information of mammographic images, we propose a novel classification method based on a convolutional neural network. Specifically, the proposed method designs two branches to extract the discriminative features from the mediolateral oblique and craniocaudal (CC) mammographic views. The features extracted from the two-view mammographic images contain complementary information that enables breast cancer to be more easily distinguished. Moreover, an attention block is introduced to capture the channel-wise information by adjusting the weight of each feature map, which is beneficial to emphasising the important features of mammographic images. Furthermore, we add a penalty term based on the fuzzy cluster algorithm to the cross-entropy function, which improves the generalisation ability of the classification model by maximising the interclass distance and minimising the intraclass distance of the samples. The experimental results on the Digital Database for Screening Mammography, INbreast, and MIAS mammography databases illustrate that the proposed method achieves the best classification performance and is more robust than the compared state-of-the-art classification methods.