Breast cancer detection heavily relies on medical imaging, particularly ultrasound, for early diagnosis and effectivetreatment. This research addresses the challenges associated with computer-aided diagnosis (CAD) of ...Breast cancer detection heavily relies on medical imaging, particularly ultrasound, for early diagnosis and effectivetreatment. This research addresses the challenges associated with computer-aided diagnosis (CAD) of breastcancer fromultrasound images. The primary challenge is accurately distinguishing between malignant and benigntumors, complicated by factors such as speckle noise, variable image quality, and the need for precise segmentationand classification. The main objective of the research paper is to develop an advanced methodology for breastultrasound image classification, focusing on speckle noise reduction, precise segmentation, feature extraction, andmachine learning-based classification. A unique approach is introduced that combines Enhanced Speckle ReducedAnisotropic Diffusion (SRAD) filters for speckle noise reduction, U-NET-based segmentation, Genetic Algorithm(GA)-based feature selection, and Random Forest and Bagging Tree classifiers, resulting in a novel and efficientmodel. To test and validate the hybrid model, rigorous experimentations were performed and results state thatthe proposed hybrid model achieved accuracy rate of 99.9%, outperforming other existing techniques, and alsosignificantly reducing computational time. This enhanced accuracy, along with improved sensitivity and specificity,makes the proposed hybrid model a valuable addition to CAD systems in breast cancer diagnosis, ultimatelyenhancing diagnostic accuracy in clinical applications.展开更多
The scientists are dedicated to studying the detection of Alzheimer’s disease onset to find a cure, or at the very least, medication that can slow the progression of the disease. This article explores the effectivene...The scientists are dedicated to studying the detection of Alzheimer’s disease onset to find a cure, or at the very least, medication that can slow the progression of the disease. This article explores the effectiveness of longitudinal data analysis, artificial intelligence, and machine learning approaches based on magnetic resonance imaging and positron emission tomography neuroimaging modalities for progression estimation and the detection of Alzheimer’s disease onset. The significance of feature extraction in highly complex neuroimaging data, identification of vulnerable brain regions, and the determination of the threshold values for plaques, tangles, and neurodegeneration of these regions will extensively be evaluated. Developing automated methods to improve the aforementioned research areas would enable specialists to determine the progression of the disease and find the link between the biomarkers and more accurate detection of Alzheimer’s disease onset.展开更多
Metamaterials(MTM)can enhance the properties of microwaves and also exceed some limitations of devices used in technical practice.Note that the antenna is the element for realizing a microwave imaging(MWI)system since...Metamaterials(MTM)can enhance the properties of microwaves and also exceed some limitations of devices used in technical practice.Note that the antenna is the element for realizing a microwave imaging(MWI)system since it is where signal transmission and absorption occur.Ultra-Wideband(UWB)antenna superstrates with MTM elements to ensure the signal transmitted from the antenna reaches the tumor and is absorbed by the same antenna.The lack of conventional head imaging techniques,for instance,Magnetic Resonance Imaging(MRI)and Computerized Tomography(CT)-scan,has been demonstrated in the paper focusing on the point of failure of these techniques for prompt diagnosis and portable systems.Furthermore,the importance ofMWIhas been addressed elaborately to portray its effectiveness and aptness for a primary tumor diagnosis.Other than that,MTM element designs have been discussed thoroughly based on their performances towards the contributions to the better image resolution of MWI with detailed reasonings.This paper proposes the novel design of a Zeroindex Split RingResonator(SRR)MTMelement superstrate with a UWB antenna implemented in MWI systems for detecting tumor.The novel design of the MTM enables the realization of a high gain of a superstrate UWB antenna with the highest gain of 5.70 dB.Besides that,the MTM imitates the conduct of the zeroreflection phase on the resonance frequency,which does not exist.An antenna with an MTM unit is of a 7×4 and 10×5 Zero-index SRR MTM element that acts as a superstrate plane to the antenna.Apart from that,Rogers(RT5880)substrate material is employed to fabricate the designed MTM unit cell,with the following characteristics:0.51mm thickness,the loss tangent of 0.02,as well as the relative permittivity of 2.2,with Computer Simulation Technology(CST)performing the simulation and design.Both MTM unit cells of 7×4 and 10×5 attained 0°with respect to the reflection phase at the 2.70 GHz frequency band.The first design,MTM Antenna Design 1,consists of a 7×4 MTM unit cell that observed a rise of 5.70 dB with a return loss(S11)−20.007 dB at 2.70 GHz frequency.The second design,MTM Antenna Design 2,consists of 10×5 MTM unit cells that recorded a gain of 5.66 dB,having the return loss(S11)−19.734 dB at 2.70 GHz frequency.Comparing these two MTM elements superstrates with the antenna,one can notice that the 7×4 MTM element shape has a low number of the unit cell with high gain and is a better choice than the 10×5 MTM element in realizing MTM element superstrates antenna for MWI.展开更多
This paper proposed a high-sensitivity phase imaging eddy current magneto-optical (PI-ECMO) system for carbon fiber reinforced polymer (CFRP) defect detection. In contrast to other eddy current-based detection systems...This paper proposed a high-sensitivity phase imaging eddy current magneto-optical (PI-ECMO) system for carbon fiber reinforced polymer (CFRP) defect detection. In contrast to other eddy current-based detection systems, the proposed system employs a fixed position excitation coil while enabling the detection point to move within the detection region. This configuration effectively mitigates the interference caused by the lift-off effect, which is commonly observed in systems with moving excitation coils. Correspondingly, the relationship between the defect characteristics (orientation and position) and the surface vertical magnetic field distribution (amplitude and phase) is studied in detail by theoretical analysis and numerical simulations. Experiments conducted on woven CFRP plates demonstrate that the designed PI-ECMO system is capable of effectively detecting both surface and internal cracks, as well as impact defects. The excitation current is significantly reduced compared with traditional eddy current magneto-optical (ECMO) systems.展开更多
Diagnosing various diseases such as glaucoma,age-related macular degeneration,cardiovascular conditions,and diabetic retinopathy involves segmenting retinal blood vessels.The task is particularly challenging when deal...Diagnosing various diseases such as glaucoma,age-related macular degeneration,cardiovascular conditions,and diabetic retinopathy involves segmenting retinal blood vessels.The task is particularly challenging when dealing with color fundus images due to issues like non-uniformillumination,low contrast,and variations in vessel appearance,especially in the presence of different pathologies.Furthermore,the speed of the retinal vessel segmentation system is of utmost importance.With the surge of now available big data,the speed of the algorithm becomes increasingly important,carrying almost equivalent weightage to the accuracy of the algorithm.To address these challenges,we present a novel approach for retinal vessel segmentation,leveraging efficient and robust techniques based on multiscale line detection and mathematical morphology.Our algorithm’s performance is evaluated on two publicly available datasets,namely the Digital Retinal Images for Vessel Extraction dataset(DRIVE)and the Structure Analysis of Retina(STARE)dataset.The experimental results demonstrate the effectiveness of our method,withmean accuracy values of 0.9467 forDRIVE and 0.9535 for STARE datasets,aswell as sensitivity values of 0.6952 forDRIVE and 0.6809 for STARE datasets.Notably,our algorithmexhibits competitive performance with state-of-the-art methods.Importantly,it operates at an average speed of 3.73 s per image for DRIVE and 3.75 s for STARE datasets.It is worth noting that these results were achieved using Matlab scripts containing multiple loops.This suggests that the processing time can be further reduced by replacing loops with vectorization.Thus the proposed algorithm can be deployed in real time applications.In summary,our proposed system strikes a fine balance between swift computation and accuracy that is on par with the best available methods in the field.展开更多
In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,...In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,diagnosis and evaluation of kidney and urinary tract disease,providing insight into the specific type and severity.However,manual urine sediment examination is labor-intensive,time-consuming,and subjective.Traditional machine learning based object detection methods require hand-crafted features for localization and classification,which have poor generalization capabilities and are difficult to quickly and accurately detect the number of urine sediments.Deep learning based object detection methods have the potential to address the challenges mentioned above,but these methods require access to large urine sediment image datasets.Unfortunately,only a limited number of publicly available urine sediment datasets are currently available.To alleviate the lack of urine sediment datasets in medical image analysis,we propose a new dataset named UriSed2K,which contains 2465 high-quality images annotated with expert guidance.Two main challenges are associated with our dataset:a large number of small objects and the occlusion between these small objects.Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset.Specifically,our goal is to improve the accuracy and efficiency of the detection algorithm and,in doing so,provide medical professionals with an automatic detector that saves time and effort.We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO.The proposed algorithm comprises a local context attention module and a global background suppression module,which aid the detector in distinguishing urine sediment features in the image.The local context attention module captures context information beyond the object region,while the global background suppression module emphasizes objects in uninformative backgrounds.We comprehensively evaluate our method on the UriSed2K dataset,which includes seven categories of urine sediments,such as erythrocytes(red blood cells),leukocytes(white blood cells),epithelial cells,crystals,mycetes,broken erythrocytes,and broken leukocytes,achieving the best average precision(AP)of 95.3%while taking only 10 ms per image.The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.展开更多
Recognizing handwritten characters remains a critical and formidable challenge within the realm of computervision. Although considerable strides have been made in enhancing English handwritten character recognitionthr...Recognizing handwritten characters remains a critical and formidable challenge within the realm of computervision. Although considerable strides have been made in enhancing English handwritten character recognitionthrough various techniques, deciphering Arabic handwritten characters is particularly intricate. This complexityarises from the diverse array of writing styles among individuals, coupled with the various shapes that a singlecharacter can take when positioned differently within document images, rendering the task more perplexing. Inthis study, a novel segmentation method for Arabic handwritten scripts is suggested. This work aims to locatethe local minima of the vertical and diagonal word image densities to precisely identify the segmentation pointsbetween the cursive letters. The proposed method starts with pre-processing the word image without affectingits main features, then calculates the directions pixel density of the word image by scanning it vertically and fromangles 30° to 90° to count the pixel density fromall directions and address the problem of overlapping letters, whichis a commonly attitude in writing Arabic texts by many people. Local minima and thresholds are also determinedto identify the ideal segmentation area. The proposed technique is tested on samples obtained fromtwo datasets: Aself-curated image dataset and the IFN/ENIT dataset. The results demonstrate that the proposed method achievesa significant improvement in the proportions of cursive segmentation of 92.96% on our dataset, as well as 89.37%on the IFN/ENIT dataset.展开更多
A novel image encryption scheme based on parallel compressive sensing and edge detection embedding technology is proposed to improve visual security. Firstly, the plain image is sparsely represented using the discrete...A novel image encryption scheme based on parallel compressive sensing and edge detection embedding technology is proposed to improve visual security. Firstly, the plain image is sparsely represented using the discrete wavelet transform.Then, the coefficient matrix is scrambled and compressed to obtain a size-reduced image using the Fisher–Yates shuffle and parallel compressive sensing. Subsequently, to increase the security of the proposed algorithm, the compressed image is re-encrypted through permutation and diffusion to obtain a noise-like secret image. Finally, an adaptive embedding method based on edge detection for different carrier images is proposed to generate a visually meaningful cipher image. To improve the plaintext sensitivity of the algorithm, the counter mode is combined with the hash function to generate keys for chaotic systems. Additionally, an effective permutation method is designed to scramble the pixels of the compressed image in the re-encryption stage. The simulation results and analyses demonstrate that the proposed algorithm performs well in terms of visual security and decryption quality.展开更多
The intelligent detection technology driven by X-ray images and deep learning represents the forefront of advanced techniques and development trends in flaw detection and automated evaluation of light alloy castings.H...The intelligent detection technology driven by X-ray images and deep learning represents the forefront of advanced techniques and development trends in flaw detection and automated evaluation of light alloy castings.However,the efficacy of deep learning models hinges upon a substantial abundance of flaw samples.The existing research on X-ray image augmentation for flaw detection suffers from shortcomings such as poor diversity of flaw samples and low reliability of quality evaluation.To this end,a novel approach was put forward,which involves the creation of the Interpolation-Deep Convolutional Generative Adversarial Network(I-DCGAN)for flaw detection image generation and a comprehensive evaluation algorithm named TOPSIS-IFP.I-DCGAN enables the generation of high-resolution,diverse simulated images with multiple appearances,achieving an improvement in sample diversity and quality while maintaining a relatively lower computational complexity.TOPSIS-IFP facilitates multi-dimensional quality evaluation,including aspects such as diversity,authenticity,image distribution difference,and image distortion degree.The results indicate that the X-ray radiographic images of magnesium and aluminum alloy castings achieve optimal performance when trained up to the 800th and 600th epochs,respectively.The TOPSIS-IFP value reaches 78.7%and 73.8%similarity to the ideal solution,respectively.Compared to single index evaluation,the TOPSIS-IFP algorithm achieves higher-quality simulated images at the optimal training epoch.This approach successfully mitigates the issue of unreliable quality associated with single index evaluation.The image generation and comprehensive quality evaluation method developed in this paper provides a novel approach for image augmentation in flaw recognition,holding significant importance for enhancing the robustness of subsequent flaw recognition networks.展开更多
Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unman...Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle(UAV)imagery.Addressing these limitations,we propose a hybrid transformer-based detector,H-DETR,and enhance it for dense small objects,leading to an accurate and efficient model.Firstly,we introduce a hybrid transformer encoder,which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently.Furthermore,we propose two novel strategies to enhance detection performance without incurring additional inference computation.Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression.Adversarial denoising learning is a novel enhancement method inspired by adversarial learning,which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise.Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach,achieving a significant improvement in accuracy with a reduction in computational complexity.Our method achieves 31.9%and 21.1%AP on the VisDrone and UAVDT datasets,respectively,and has a faster inference speed,making it a competitive model in UAV image object detection.展开更多
BACKGROUND Deep learning provides an efficient automatic image recognition method for small bowel(SB)capsule endoscopy(CE)that can assist physicians in diagnosis.However,the existing deep learning models present some ...BACKGROUND Deep learning provides an efficient automatic image recognition method for small bowel(SB)capsule endoscopy(CE)that can assist physicians in diagnosis.However,the existing deep learning models present some unresolved challenges.AIM To propose a novel and effective classification and detection model to automatically identify various SB lesions and their bleeding risks,and label the lesions accurately so as to enhance the diagnostic efficiency of physicians and the ability to identify high-risk bleeding groups.METHODS The proposed model represents a two-stage method that combined image classification with object detection.First,we utilized the improved ResNet-50 classification model to classify endoscopic images into SB lesion images,normal SB mucosa images,and invalid images.Then,the improved YOLO-V5 detection model was utilized to detect the type of lesion and its risk of bleeding,and the location of the lesion was marked.We constructed training and testing sets and compared model-assisted reading with physician reading.RESULTS The accuracy of the model constructed in this study reached 98.96%,which was higher than the accuracy of other systems using only a single module.The sensitivity,specificity,and accuracy of the model-assisted reading detection of all images were 99.17%,99.92%,and 99.86%,which were significantly higher than those of the endoscopists’diagnoses.The image processing time of the model was 48 ms/image,and the image processing time of the physicians was 0.40±0.24 s/image(P<0.001).CONCLUSION The deep learning model of image classification combined with object detection exhibits a satisfactory diagnostic effect on a variety of SB lesions and their bleeding risks in CE images,which enhances the diagnostic efficiency of physicians and improves the ability of physicians to identify high-risk bleeding groups.展开更多
With the explosive growth of false information on social media platforms, the automatic detection of multimodalfalse information has received increasing attention. Recent research has significantly contributed to mult...With the explosive growth of false information on social media platforms, the automatic detection of multimodalfalse information has received increasing attention. Recent research has significantly contributed to multimodalinformation exchange and fusion, with many methods attempting to integrate unimodal features to generatemultimodal news representations. However, they still need to fully explore the hierarchical and complex semanticcorrelations between different modal contents, severely limiting their performance detecting multimodal falseinformation. This work proposes a two-stage detection framework for multimodal false information detection,called ASMFD, which is based on image aesthetic similarity to segment and explores the consistency andinconsistency features of images and texts. Specifically, we first use the Contrastive Language-Image Pre-training(CLIP) model to learn the relationship between text and images through label awareness and train an imageaesthetic attribute scorer using an aesthetic attribute dataset. Then, we calculate the aesthetic similarity betweenthe image and related images and use this similarity as a threshold to divide the multimodal correlation matrixinto consistency and inconsistencymatrices. Finally, the fusionmodule is designed to identify essential features fordetectingmultimodal false information. In extensive experiments on four datasets, the performance of the ASMFDis superior to state-of-the-art baseline methods.展开更多
This paper addresses the common orthopedic trauma of spinal vertebral fractures and aims to enhance doctors’diagnostic efficiency.Therefore,a deep-learning-based automated diagnostic systemwithmulti-label segmentatio...This paper addresses the common orthopedic trauma of spinal vertebral fractures and aims to enhance doctors’diagnostic efficiency.Therefore,a deep-learning-based automated diagnostic systemwithmulti-label segmentation is proposed to recognize the condition of vertebral fractures.The whole spine Computed Tomography(CT)image is segmented into the fracture,normal,and background using U-Net,and the fracture degree of each vertebra is evaluated(Genant semi-qualitative evaluation).The main work of this paper includes:First,based on the spatial configuration network(SCN)structure,U-Net is used instead of the SCN feature extraction network.The attention mechanismandthe residual connectionbetweenthe convolutional layers are added in the local network(LN)stage.Multiple filtering is added in the global network(GN)stage,and each layer of the LN decoder feature map is filtered separately using dot product,and the filtered features are re-convolved to obtain the GN output heatmap.Second,a network model with improved SCN(M-SCN)helps automatically localize the center-of-mass position of each vertebra,and the voxels around each localized vertebra were clipped,eliminating a large amount of redundant information(e.g.,background and other interfering vertebrae)and keeping the vertebrae to be segmented in the center of the image.Multilabel segmentation of the clipped portion was subsequently performed using U-Net.This paper uses VerSe’19,VerSe’20(using only data containing vertebral fractures),and private data(provided by Guizhou Orthopedic Hospital)for model training and evaluation.Compared with the original SCN network,the M-SCN reduced the prediction error rate by 1.09%and demonstrated the effectiveness of the improvement in ablation experiments.In the vertebral segmentation experiment,the Dice Similarity Coefficient(DSC)index reached 93.50%and the Maximum Symmetry Surface Distance(MSSD)index was 4.962 mm,with accuracy and recall of 95.82%and 91.73%,respectively.Fractured vertebrae were also marked as red and normal vertebrae were marked as white in the experiment,and the semi-qualitative assessment results of Genant were provided,as well as the results of spinal localization visualization and 3D reconstructed views of the spine to analyze the actual predictive ability of the model.It provides a promising tool for vertebral fracture detection.展开更多
Oscillation detection has been a hot research topic in industries due to the high incidence of oscillation loops and their negative impact on plant profitability.Although numerous automatic detection techniques have b...Oscillation detection has been a hot research topic in industries due to the high incidence of oscillation loops and their negative impact on plant profitability.Although numerous automatic detection techniques have been proposed,most of them can only address part of the practical difficulties.An oscillation is heuristically defined as a visually apparent periodic variation.However,manual visual inspection is labor-intensive and prone to missed detection.Convolutional neural networks(CNNs),inspired by animal visual systems,have been raised with powerful feature extraction capabilities.In this work,an exploration of the typical CNN models for visual oscillation detection is performed.Specifically,we tested MobileNet-V1,ShuffleNet-V2,Efficient Net-B0,and GhostNet models,and found that such a visual framework is well-suited for oscillation detection.The feasibility and validity of this framework are verified utilizing extensive numerical and industrial cases.Compared with state-of-theart oscillation detectors,the suggested framework is more straightforward and more robust to noise and mean-nonstationarity.In addition,this framework generalizes well and is capable of handling features that are not present in the training data,such as multiple oscillations and outliers.展开更多
In rice production,the prevention and management of pests and diseases have always received special attention.Traditional methods require human experts,which is costly and time-consuming.Due to the complexity of the s...In rice production,the prevention and management of pests and diseases have always received special attention.Traditional methods require human experts,which is costly and time-consuming.Due to the complexity of the structure of rice diseases and pests,quickly and reliably recognizing and locating them is difficult.Recently,deep learning technology has been employed to detect and identify rice diseases and pests.This paper introduces common publicly available datasets;summarizes the applications on rice diseases and pests from the aspects of image recognition,object detection,image segmentation,attention mechanism,and few-shot learning methods according to the network structure differences;and compares the performances of existing studies.Finally,the current issues and challenges are explored fromthe perspective of data acquisition,data processing,and application,providing possible solutions and suggestions.This study aims to review various DL models and provide improved insight into DL techniques and their cutting-edge progress in the prevention and management of rice diseases and pests.展开更多
Camouflaged people are extremely expert in actively concealing themselves by effectively utilizing cover and the surrounding environment. Despite advancements in optical detection capabilities through imaging systems,...Camouflaged people are extremely expert in actively concealing themselves by effectively utilizing cover and the surrounding environment. Despite advancements in optical detection capabilities through imaging systems, including spectral, polarization, and infrared technologies, there is still a lack of effective real-time method for accurately detecting small-size and high-efficient camouflaged people in complex real-world scenes. Here, this study proposes a snapshot multispectral image-based camouflaged detection model, multispectral YOLO(MS-YOLO), which utilizes the SPD-Conv and Sim AM modules to effectively represent targets and suppress background interference by exploiting the spatial-spectral target information. Besides, the study constructs the first real-shot multispectral camouflaged people dataset(MSCPD), which encompasses diverse scenes, target scales, and attitudes. To minimize information redundancy, MS-YOLO selects an optimal subset of 12 bands with strong feature representation and minimal inter-band correlation as input. Through experiments on the MSCPD, MS-YOLO achieves a mean Average Precision of 94.31% and real-time detection at 65 frames per second, which confirms the effectiveness and efficiency of our method in detecting camouflaged people in various typical desert and forest scenes. Our approach offers valuable support to improve the perception capabilities of unmanned aerial vehicles in detecting enemy forces and rescuing personnel in battlefield.展开更多
Artificial intelligence(AI)is making significant strides in revolutionizing the detection of Barrett's esophagus(BE),a precursor to esophageal adenocarcinoma.In the research article by Tsai et al,researchers utili...Artificial intelligence(AI)is making significant strides in revolutionizing the detection of Barrett's esophagus(BE),a precursor to esophageal adenocarcinoma.In the research article by Tsai et al,researchers utilized endoscopic images to train an AI model,challenging the traditional distinction between endoscopic and histological BE.This approach yielded remarkable results,with the AI system achieving an accuracy of 94.37%,sensitivity of 94.29%,and specificity of 94.44%.The study's extensive dataset enhances the AI model's practicality,offering valuable support to endoscopists by minimizing unnecessary biopsies.However,questions about the applicability to different endoscopic systems remain.The study underscores the potential of AI in BE detection while highlighting the need for further research to assess its adaptability to diverse clinical settings.展开更多
In pursuit of cost-effective manufacturing,enterprises are increasingly adopting the practice of utilizing recycled semiconductor chips.To ensure consistent chip orientation during packaging,a circular marker on the f...In pursuit of cost-effective manufacturing,enterprises are increasingly adopting the practice of utilizing recycled semiconductor chips.To ensure consistent chip orientation during packaging,a circular marker on the front side is employed for pin alignment following successful functional testing.However,recycled chips often exhibit substantial surface wear,and the identification of the relatively small marker proves challenging.Moreover,the complexity of generic target detection algorithms hampers seamless deployment.Addressing these issues,this paper introduces a lightweight YOLOv8s-based network tailored for detecting markings on recycled chips,termed Van-YOLOv8.Initially,to alleviate the influence of diminutive,low-resolution markings on the precision of deep learning models,we utilize an upscaling approach for enhanced resolution.This technique relies on the Super-Resolution Generative Adversarial Network with Extended Training(SRGANext)network,facilitating the reconstruction of high-fidelity images that align with input specifications.Subsequently,we replace the original YOLOv8smodel’s backbone feature extraction network with the lightweight VanillaNetwork(VanillaNet),simplifying the branch structure to reduce network parameters.Finally,a Hybrid Attention Mechanism(HAM)is implemented to capture essential details from input images,improving feature representation while concurrently expediting model inference speed.Experimental results demonstrate that the Van-YOLOv8 network outperforms the original YOLOv8s on a recycled chip dataset in various aspects.Significantly,it demonstrates superiority in parameter count,computational intricacy,precision in identifying targets,and speed when compared to certain prevalent algorithms in the current landscape.The proposed approach proves promising for real-time detection of recycled chips in practical factory settings.展开更多
We propose a novel image segmentation algorithm to tackle the challenge of limited recognition and segmentation performance in identifying welding seam images during robotic intelligent operations.Initially,to enhance...We propose a novel image segmentation algorithm to tackle the challenge of limited recognition and segmentation performance in identifying welding seam images during robotic intelligent operations.Initially,to enhance the capability of deep neural networks in extracting geometric attributes from depth images,we developed a novel deep geometric convolution operator(DGConv).DGConv is utilized to construct a deep local geometric feature extraction module,facilitating a more comprehensive exploration of the intrinsic geometric information within depth images.Secondly,we integrate the newly proposed deep geometric feature module with the Fully Convolutional Network(FCN8)to establish a high-performance deep neural network algorithm tailored for depth image segmentation.Concurrently,we enhance the FCN8 detection head by separating the segmentation and classification processes.This enhancement significantly boosts the network’s overall detection capability.Thirdly,for a comprehensive assessment of our proposed algorithm and its applicability in real-world industrial settings,we curated a line-scan image dataset featuring weld seams.This dataset,named the Standardized Linear Depth Profile(SLDP)dataset,was collected from actual industrial sites where autonomous robots are in operation.Ultimately,we conducted experiments utilizing the SLDP dataset,achieving an average accuracy of 92.7%.Our proposed approach exhibited a remarkable performance improvement over the prior method on the identical dataset.Moreover,we have successfully deployed the proposed algorithm in genuine industrial environments,fulfilling the prerequisites of unmanned robot operations.展开更多
Cardiovascular disease is the leading cause of death globally.This disease causes loss of heart muscles and is also responsible for the death of heart cells,sometimes damaging their functionality.A person’s life may ...Cardiovascular disease is the leading cause of death globally.This disease causes loss of heart muscles and is also responsible for the death of heart cells,sometimes damaging their functionality.A person’s life may depend on receiving timely assistance as soon as possible.Thus,minimizing the death ratio can be achieved by early detection of heart attack(HA)symptoms.In the United States alone,an estimated 610,000 people die fromheart attacks each year,accounting for one in every four fatalities.However,by identifying and reporting heart attack symptoms early on,it is possible to reduce damage and save many lives significantly.Our objective is to devise an algorithm aimed at helping individuals,particularly elderly individuals living independently,to safeguard their lives.To address these challenges,we employ deep learning techniques.We have utilized a vision transformer(ViT)to address this problem.However,it has a significant overhead cost due to its memory consumption and computational complexity because of scaling dot-product attention.Also,since transformer performance typically relies on large-scale or adequate data,adapting ViT for smaller datasets is more challenging.In response,we propose a three-in-one steam model,theMulti-Head Attention Vision Hybrid(MHAVH).Thismodel integrates a real-time posture recognition framework to identify chest pain postures indicative of heart attacks using transfer learning techniques,such as ResNet-50 and VGG-16,renowned for their robust feature extraction capabilities.By incorporatingmultiple heads into the vision transformer to generate additional metrics and enhance heart-detection capabilities,we leverage a 2019 posture-based dataset comprising RGB images,a novel creation by the author that marks the first dataset tailored for posture-based heart attack detection.Given the limited online data availability,we segmented this dataset into gender categories(male and female)and conducted testing on both segmented and original datasets.The training accuracy of our model reached an impressive 99.77%.Upon testing,the accuracy for male and female datasets was recorded at 92.87%and 75.47%,respectively.The combined dataset accuracy is 93.96%,showcasing a commendable performance overall.Our proposed approach demonstrates versatility in accommodating small and large datasets,offering promising prospects for real-world applications.展开更多
基金funded through Researchers Supporting Project Number(RSPD2024R996)King Saud University,Riyadh,Saudi Arabia。
文摘Breast cancer detection heavily relies on medical imaging, particularly ultrasound, for early diagnosis and effectivetreatment. This research addresses the challenges associated with computer-aided diagnosis (CAD) of breastcancer fromultrasound images. The primary challenge is accurately distinguishing between malignant and benigntumors, complicated by factors such as speckle noise, variable image quality, and the need for precise segmentationand classification. The main objective of the research paper is to develop an advanced methodology for breastultrasound image classification, focusing on speckle noise reduction, precise segmentation, feature extraction, andmachine learning-based classification. A unique approach is introduced that combines Enhanced Speckle ReducedAnisotropic Diffusion (SRAD) filters for speckle noise reduction, U-NET-based segmentation, Genetic Algorithm(GA)-based feature selection, and Random Forest and Bagging Tree classifiers, resulting in a novel and efficientmodel. To test and validate the hybrid model, rigorous experimentations were performed and results state thatthe proposed hybrid model achieved accuracy rate of 99.9%, outperforming other existing techniques, and alsosignificantly reducing computational time. This enhanced accuracy, along with improved sensitivity and specificity,makes the proposed hybrid model a valuable addition to CAD systems in breast cancer diagnosis, ultimatelyenhancing diagnostic accuracy in clinical applications.
文摘The scientists are dedicated to studying the detection of Alzheimer’s disease onset to find a cure, or at the very least, medication that can slow the progression of the disease. This article explores the effectiveness of longitudinal data analysis, artificial intelligence, and machine learning approaches based on magnetic resonance imaging and positron emission tomography neuroimaging modalities for progression estimation and the detection of Alzheimer’s disease onset. The significance of feature extraction in highly complex neuroimaging data, identification of vulnerable brain regions, and the determination of the threshold values for plaques, tangles, and neurodegeneration of these regions will extensively be evaluated. Developing automated methods to improve the aforementioned research areas would enable specialists to determine the progression of the disease and find the link between the biomarkers and more accurate detection of Alzheimer’s disease onset.
基金the Fundamental Research Grant Scheme (FRGS/1/2018/ICT06/UNIMAP/02/1)of the Ministry of Higher Education of Malaysia.
文摘Metamaterials(MTM)can enhance the properties of microwaves and also exceed some limitations of devices used in technical practice.Note that the antenna is the element for realizing a microwave imaging(MWI)system since it is where signal transmission and absorption occur.Ultra-Wideband(UWB)antenna superstrates with MTM elements to ensure the signal transmitted from the antenna reaches the tumor and is absorbed by the same antenna.The lack of conventional head imaging techniques,for instance,Magnetic Resonance Imaging(MRI)and Computerized Tomography(CT)-scan,has been demonstrated in the paper focusing on the point of failure of these techniques for prompt diagnosis and portable systems.Furthermore,the importance ofMWIhas been addressed elaborately to portray its effectiveness and aptness for a primary tumor diagnosis.Other than that,MTM element designs have been discussed thoroughly based on their performances towards the contributions to the better image resolution of MWI with detailed reasonings.This paper proposes the novel design of a Zeroindex Split RingResonator(SRR)MTMelement superstrate with a UWB antenna implemented in MWI systems for detecting tumor.The novel design of the MTM enables the realization of a high gain of a superstrate UWB antenna with the highest gain of 5.70 dB.Besides that,the MTM imitates the conduct of the zeroreflection phase on the resonance frequency,which does not exist.An antenna with an MTM unit is of a 7×4 and 10×5 Zero-index SRR MTM element that acts as a superstrate plane to the antenna.Apart from that,Rogers(RT5880)substrate material is employed to fabricate the designed MTM unit cell,with the following characteristics:0.51mm thickness,the loss tangent of 0.02,as well as the relative permittivity of 2.2,with Computer Simulation Technology(CST)performing the simulation and design.Both MTM unit cells of 7×4 and 10×5 attained 0°with respect to the reflection phase at the 2.70 GHz frequency band.The first design,MTM Antenna Design 1,consists of a 7×4 MTM unit cell that observed a rise of 5.70 dB with a return loss(S11)−20.007 dB at 2.70 GHz frequency.The second design,MTM Antenna Design 2,consists of 10×5 MTM unit cells that recorded a gain of 5.66 dB,having the return loss(S11)−19.734 dB at 2.70 GHz frequency.Comparing these two MTM elements superstrates with the antenna,one can notice that the 7×4 MTM element shape has a low number of the unit cell with high gain and is a better choice than the 10×5 MTM element in realizing MTM element superstrates antenna for MWI.
基金the National Natural Science Foundation of China under Grants No.U2030205,No.62003075,No.61903065,and No.62003074Sichuan Science and Technology Planning Project under Grant No.2022JDJQ0040.
文摘This paper proposed a high-sensitivity phase imaging eddy current magneto-optical (PI-ECMO) system for carbon fiber reinforced polymer (CFRP) defect detection. In contrast to other eddy current-based detection systems, the proposed system employs a fixed position excitation coil while enabling the detection point to move within the detection region. This configuration effectively mitigates the interference caused by the lift-off effect, which is commonly observed in systems with moving excitation coils. Correspondingly, the relationship between the defect characteristics (orientation and position) and the surface vertical magnetic field distribution (amplitude and phase) is studied in detail by theoretical analysis and numerical simulations. Experiments conducted on woven CFRP plates demonstrate that the designed PI-ECMO system is capable of effectively detecting both surface and internal cracks, as well as impact defects. The excitation current is significantly reduced compared with traditional eddy current magneto-optical (ECMO) systems.
文摘Diagnosing various diseases such as glaucoma,age-related macular degeneration,cardiovascular conditions,and diabetic retinopathy involves segmenting retinal blood vessels.The task is particularly challenging when dealing with color fundus images due to issues like non-uniformillumination,low contrast,and variations in vessel appearance,especially in the presence of different pathologies.Furthermore,the speed of the retinal vessel segmentation system is of utmost importance.With the surge of now available big data,the speed of the algorithm becomes increasingly important,carrying almost equivalent weightage to the accuracy of the algorithm.To address these challenges,we present a novel approach for retinal vessel segmentation,leveraging efficient and robust techniques based on multiscale line detection and mathematical morphology.Our algorithm’s performance is evaluated on two publicly available datasets,namely the Digital Retinal Images for Vessel Extraction dataset(DRIVE)and the Structure Analysis of Retina(STARE)dataset.The experimental results demonstrate the effectiveness of our method,withmean accuracy values of 0.9467 forDRIVE and 0.9535 for STARE datasets,aswell as sensitivity values of 0.6952 forDRIVE and 0.6809 for STARE datasets.Notably,our algorithmexhibits competitive performance with state-of-the-art methods.Importantly,it operates at an average speed of 3.73 s per image for DRIVE and 3.75 s for STARE datasets.It is worth noting that these results were achieved using Matlab scripts containing multiple loops.This suggests that the processing time can be further reduced by replacing loops with vectorization.Thus the proposed algorithm can be deployed in real time applications.In summary,our proposed system strikes a fine balance between swift computation and accuracy that is on par with the best available methods in the field.
基金This work was partially supported by the National Natural Science Foundation of China(Grant Nos.61906168,U20A20171)Zhejiang Provincial Natural Science Foundation of China(Grant Nos.LY23F020023,LY21F020027)Construction of Hubei Provincial Key Laboratory for Intelligent Visual Monitoring of Hydropower Projects(Grant Nos.2022SDSJ01).
文摘In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,diagnosis and evaluation of kidney and urinary tract disease,providing insight into the specific type and severity.However,manual urine sediment examination is labor-intensive,time-consuming,and subjective.Traditional machine learning based object detection methods require hand-crafted features for localization and classification,which have poor generalization capabilities and are difficult to quickly and accurately detect the number of urine sediments.Deep learning based object detection methods have the potential to address the challenges mentioned above,but these methods require access to large urine sediment image datasets.Unfortunately,only a limited number of publicly available urine sediment datasets are currently available.To alleviate the lack of urine sediment datasets in medical image analysis,we propose a new dataset named UriSed2K,which contains 2465 high-quality images annotated with expert guidance.Two main challenges are associated with our dataset:a large number of small objects and the occlusion between these small objects.Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset.Specifically,our goal is to improve the accuracy and efficiency of the detection algorithm and,in doing so,provide medical professionals with an automatic detector that saves time and effort.We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO.The proposed algorithm comprises a local context attention module and a global background suppression module,which aid the detector in distinguishing urine sediment features in the image.The local context attention module captures context information beyond the object region,while the global background suppression module emphasizes objects in uninformative backgrounds.We comprehensively evaluate our method on the UriSed2K dataset,which includes seven categories of urine sediments,such as erythrocytes(red blood cells),leukocytes(white blood cells),epithelial cells,crystals,mycetes,broken erythrocytes,and broken leukocytes,achieving the best average precision(AP)of 95.3%while taking only 10 ms per image.The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.
文摘Recognizing handwritten characters remains a critical and formidable challenge within the realm of computervision. Although considerable strides have been made in enhancing English handwritten character recognitionthrough various techniques, deciphering Arabic handwritten characters is particularly intricate. This complexityarises from the diverse array of writing styles among individuals, coupled with the various shapes that a singlecharacter can take when positioned differently within document images, rendering the task more perplexing. Inthis study, a novel segmentation method for Arabic handwritten scripts is suggested. This work aims to locatethe local minima of the vertical and diagonal word image densities to precisely identify the segmentation pointsbetween the cursive letters. The proposed method starts with pre-processing the word image without affectingits main features, then calculates the directions pixel density of the word image by scanning it vertically and fromangles 30° to 90° to count the pixel density fromall directions and address the problem of overlapping letters, whichis a commonly attitude in writing Arabic texts by many people. Local minima and thresholds are also determinedto identify the ideal segmentation area. The proposed technique is tested on samples obtained fromtwo datasets: Aself-curated image dataset and the IFN/ENIT dataset. The results demonstrate that the proposed method achievesa significant improvement in the proportions of cursive segmentation of 92.96% on our dataset, as well as 89.37%on the IFN/ENIT dataset.
基金supported by the Key Area R&D Program of Guangdong Province (Grant No.2022B0701180001)the National Natural Science Foundation of China (Grant No.61801127)+1 种基金the Science Technology Planning Project of Guangdong Province,China (Grant Nos.2019B010140002 and 2020B111110002)the Guangdong-Hong Kong-Macao Joint Innovation Field Project (Grant No.2021A0505080006)。
文摘A novel image encryption scheme based on parallel compressive sensing and edge detection embedding technology is proposed to improve visual security. Firstly, the plain image is sparsely represented using the discrete wavelet transform.Then, the coefficient matrix is scrambled and compressed to obtain a size-reduced image using the Fisher–Yates shuffle and parallel compressive sensing. Subsequently, to increase the security of the proposed algorithm, the compressed image is re-encrypted through permutation and diffusion to obtain a noise-like secret image. Finally, an adaptive embedding method based on edge detection for different carrier images is proposed to generate a visually meaningful cipher image. To improve the plaintext sensitivity of the algorithm, the counter mode is combined with the hash function to generate keys for chaotic systems. Additionally, an effective permutation method is designed to scramble the pixels of the compressed image in the re-encryption stage. The simulation results and analyses demonstrate that the proposed algorithm performs well in terms of visual security and decryption quality.
基金funded by the National Key R&D Program of China(2020YFB1710100)the National Natural Science Foundation of China(Nos.52275337,52090042,51905188).
文摘The intelligent detection technology driven by X-ray images and deep learning represents the forefront of advanced techniques and development trends in flaw detection and automated evaluation of light alloy castings.However,the efficacy of deep learning models hinges upon a substantial abundance of flaw samples.The existing research on X-ray image augmentation for flaw detection suffers from shortcomings such as poor diversity of flaw samples and low reliability of quality evaluation.To this end,a novel approach was put forward,which involves the creation of the Interpolation-Deep Convolutional Generative Adversarial Network(I-DCGAN)for flaw detection image generation and a comprehensive evaluation algorithm named TOPSIS-IFP.I-DCGAN enables the generation of high-resolution,diverse simulated images with multiple appearances,achieving an improvement in sample diversity and quality while maintaining a relatively lower computational complexity.TOPSIS-IFP facilitates multi-dimensional quality evaluation,including aspects such as diversity,authenticity,image distribution difference,and image distortion degree.The results indicate that the X-ray radiographic images of magnesium and aluminum alloy castings achieve optimal performance when trained up to the 800th and 600th epochs,respectively.The TOPSIS-IFP value reaches 78.7%and 73.8%similarity to the ideal solution,respectively.Compared to single index evaluation,the TOPSIS-IFP algorithm achieves higher-quality simulated images at the optimal training epoch.This approach successfully mitigates the issue of unreliable quality associated with single index evaluation.The image generation and comprehensive quality evaluation method developed in this paper provides a novel approach for image augmentation in flaw recognition,holding significant importance for enhancing the robustness of subsequent flaw recognition networks.
基金This research was funded by the Natural Science Foundation of Hebei Province(F2021506004).
文摘Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle(UAV)imagery.Addressing these limitations,we propose a hybrid transformer-based detector,H-DETR,and enhance it for dense small objects,leading to an accurate and efficient model.Firstly,we introduce a hybrid transformer encoder,which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently.Furthermore,we propose two novel strategies to enhance detection performance without incurring additional inference computation.Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression.Adversarial denoising learning is a novel enhancement method inspired by adversarial learning,which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise.Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach,achieving a significant improvement in accuracy with a reduction in computational complexity.Our method achieves 31.9%and 21.1%AP on the VisDrone and UAVDT datasets,respectively,and has a faster inference speed,making it a competitive model in UAV image object detection.
基金The Shanxi Provincial Administration of Traditional Chinese Medicine,No.2023ZYYDA2005.
文摘BACKGROUND Deep learning provides an efficient automatic image recognition method for small bowel(SB)capsule endoscopy(CE)that can assist physicians in diagnosis.However,the existing deep learning models present some unresolved challenges.AIM To propose a novel and effective classification and detection model to automatically identify various SB lesions and their bleeding risks,and label the lesions accurately so as to enhance the diagnostic efficiency of physicians and the ability to identify high-risk bleeding groups.METHODS The proposed model represents a two-stage method that combined image classification with object detection.First,we utilized the improved ResNet-50 classification model to classify endoscopic images into SB lesion images,normal SB mucosa images,and invalid images.Then,the improved YOLO-V5 detection model was utilized to detect the type of lesion and its risk of bleeding,and the location of the lesion was marked.We constructed training and testing sets and compared model-assisted reading with physician reading.RESULTS The accuracy of the model constructed in this study reached 98.96%,which was higher than the accuracy of other systems using only a single module.The sensitivity,specificity,and accuracy of the model-assisted reading detection of all images were 99.17%,99.92%,and 99.86%,which were significantly higher than those of the endoscopists’diagnoses.The image processing time of the model was 48 ms/image,and the image processing time of the physicians was 0.40±0.24 s/image(P<0.001).CONCLUSION The deep learning model of image classification combined with object detection exhibits a satisfactory diagnostic effect on a variety of SB lesions and their bleeding risks in CE images,which enhances the diagnostic efficiency of physicians and improves the ability of physicians to identify high-risk bleeding groups.
文摘With the explosive growth of false information on social media platforms, the automatic detection of multimodalfalse information has received increasing attention. Recent research has significantly contributed to multimodalinformation exchange and fusion, with many methods attempting to integrate unimodal features to generatemultimodal news representations. However, they still need to fully explore the hierarchical and complex semanticcorrelations between different modal contents, severely limiting their performance detecting multimodal falseinformation. This work proposes a two-stage detection framework for multimodal false information detection,called ASMFD, which is based on image aesthetic similarity to segment and explores the consistency andinconsistency features of images and texts. Specifically, we first use the Contrastive Language-Image Pre-training(CLIP) model to learn the relationship between text and images through label awareness and train an imageaesthetic attribute scorer using an aesthetic attribute dataset. Then, we calculate the aesthetic similarity betweenthe image and related images and use this similarity as a threshold to divide the multimodal correlation matrixinto consistency and inconsistencymatrices. Finally, the fusionmodule is designed to identify essential features fordetectingmultimodal false information. In extensive experiments on four datasets, the performance of the ASMFDis superior to state-of-the-art baseline methods.
文摘This paper addresses the common orthopedic trauma of spinal vertebral fractures and aims to enhance doctors’diagnostic efficiency.Therefore,a deep-learning-based automated diagnostic systemwithmulti-label segmentation is proposed to recognize the condition of vertebral fractures.The whole spine Computed Tomography(CT)image is segmented into the fracture,normal,and background using U-Net,and the fracture degree of each vertebra is evaluated(Genant semi-qualitative evaluation).The main work of this paper includes:First,based on the spatial configuration network(SCN)structure,U-Net is used instead of the SCN feature extraction network.The attention mechanismandthe residual connectionbetweenthe convolutional layers are added in the local network(LN)stage.Multiple filtering is added in the global network(GN)stage,and each layer of the LN decoder feature map is filtered separately using dot product,and the filtered features are re-convolved to obtain the GN output heatmap.Second,a network model with improved SCN(M-SCN)helps automatically localize the center-of-mass position of each vertebra,and the voxels around each localized vertebra were clipped,eliminating a large amount of redundant information(e.g.,background and other interfering vertebrae)and keeping the vertebrae to be segmented in the center of the image.Multilabel segmentation of the clipped portion was subsequently performed using U-Net.This paper uses VerSe’19,VerSe’20(using only data containing vertebral fractures),and private data(provided by Guizhou Orthopedic Hospital)for model training and evaluation.Compared with the original SCN network,the M-SCN reduced the prediction error rate by 1.09%and demonstrated the effectiveness of the improvement in ablation experiments.In the vertebral segmentation experiment,the Dice Similarity Coefficient(DSC)index reached 93.50%and the Maximum Symmetry Surface Distance(MSSD)index was 4.962 mm,with accuracy and recall of 95.82%and 91.73%,respectively.Fractured vertebrae were also marked as red and normal vertebrae were marked as white in the experiment,and the semi-qualitative assessment results of Genant were provided,as well as the results of spinal localization visualization and 3D reconstructed views of the spine to analyze the actual predictive ability of the model.It provides a promising tool for vertebral fracture detection.
基金the National Natural Science Foundation of China(62003298,62163036)the Major Project of Science and Technology of Yunnan Province(202202AD080005,202202AH080009)the Yunnan University Professional Degree Graduate Practice Innovation Fund Project(ZC-22222770)。
文摘Oscillation detection has been a hot research topic in industries due to the high incidence of oscillation loops and their negative impact on plant profitability.Although numerous automatic detection techniques have been proposed,most of them can only address part of the practical difficulties.An oscillation is heuristically defined as a visually apparent periodic variation.However,manual visual inspection is labor-intensive and prone to missed detection.Convolutional neural networks(CNNs),inspired by animal visual systems,have been raised with powerful feature extraction capabilities.In this work,an exploration of the typical CNN models for visual oscillation detection is performed.Specifically,we tested MobileNet-V1,ShuffleNet-V2,Efficient Net-B0,and GhostNet models,and found that such a visual framework is well-suited for oscillation detection.The feasibility and validity of this framework are verified utilizing extensive numerical and industrial cases.Compared with state-of-theart oscillation detectors,the suggested framework is more straightforward and more robust to noise and mean-nonstationarity.In addition,this framework generalizes well and is capable of handling features that are not present in the training data,such as multiple oscillations and outliers.
基金funded by Hunan Provincial Natural Science Foundation of China with Grant Numbers(2022JJ50016,2023JJ50096)Innovation Platform Open Fund of Hengyang Normal University Grant 2021HSKFJJ039Hengyang Science and Technology Plan Guiding Project with Number 202222025902.
文摘In rice production,the prevention and management of pests and diseases have always received special attention.Traditional methods require human experts,which is costly and time-consuming.Due to the complexity of the structure of rice diseases and pests,quickly and reliably recognizing and locating them is difficult.Recently,deep learning technology has been employed to detect and identify rice diseases and pests.This paper introduces common publicly available datasets;summarizes the applications on rice diseases and pests from the aspects of image recognition,object detection,image segmentation,attention mechanism,and few-shot learning methods according to the network structure differences;and compares the performances of existing studies.Finally,the current issues and challenges are explored fromthe perspective of data acquisition,data processing,and application,providing possible solutions and suggestions.This study aims to review various DL models and provide improved insight into DL techniques and their cutting-edge progress in the prevention and management of rice diseases and pests.
基金support by the National Natural Science Foundation of China (Grant No. 62005049)Natural Science Foundation of Fujian Province (Grant Nos. 2020J01451, 2022J05113)Education and Scientific Research Program for Young and Middleaged Teachers in Fujian Province (Grant No. JAT210035)。
文摘Camouflaged people are extremely expert in actively concealing themselves by effectively utilizing cover and the surrounding environment. Despite advancements in optical detection capabilities through imaging systems, including spectral, polarization, and infrared technologies, there is still a lack of effective real-time method for accurately detecting small-size and high-efficient camouflaged people in complex real-world scenes. Here, this study proposes a snapshot multispectral image-based camouflaged detection model, multispectral YOLO(MS-YOLO), which utilizes the SPD-Conv and Sim AM modules to effectively represent targets and suppress background interference by exploiting the spatial-spectral target information. Besides, the study constructs the first real-shot multispectral camouflaged people dataset(MSCPD), which encompasses diverse scenes, target scales, and attitudes. To minimize information redundancy, MS-YOLO selects an optimal subset of 12 bands with strong feature representation and minimal inter-band correlation as input. Through experiments on the MSCPD, MS-YOLO achieves a mean Average Precision of 94.31% and real-time detection at 65 frames per second, which confirms the effectiveness and efficiency of our method in detecting camouflaged people in various typical desert and forest scenes. Our approach offers valuable support to improve the perception capabilities of unmanned aerial vehicles in detecting enemy forces and rescuing personnel in battlefield.
文摘Artificial intelligence(AI)is making significant strides in revolutionizing the detection of Barrett's esophagus(BE),a precursor to esophageal adenocarcinoma.In the research article by Tsai et al,researchers utilized endoscopic images to train an AI model,challenging the traditional distinction between endoscopic and histological BE.This approach yielded remarkable results,with the AI system achieving an accuracy of 94.37%,sensitivity of 94.29%,and specificity of 94.44%.The study's extensive dataset enhances the AI model's practicality,offering valuable support to endoscopists by minimizing unnecessary biopsies.However,questions about the applicability to different endoscopic systems remain.The study underscores the potential of AI in BE detection while highlighting the need for further research to assess its adaptability to diverse clinical settings.
基金the Liaoning Provincial Department of Education 2021 Annual Scientific Research Funding Program(Grant Numbers LJKZ0535,LJKZ0526)the 2021 Annual Comprehensive Reform of Undergraduate Education Teaching(Grant Numbers JGLX2021020,JCLX2021008)Graduate Innovation Fund of Dalian Polytechnic University(Grant Number 2023CXYJ13).
文摘In pursuit of cost-effective manufacturing,enterprises are increasingly adopting the practice of utilizing recycled semiconductor chips.To ensure consistent chip orientation during packaging,a circular marker on the front side is employed for pin alignment following successful functional testing.However,recycled chips often exhibit substantial surface wear,and the identification of the relatively small marker proves challenging.Moreover,the complexity of generic target detection algorithms hampers seamless deployment.Addressing these issues,this paper introduces a lightweight YOLOv8s-based network tailored for detecting markings on recycled chips,termed Van-YOLOv8.Initially,to alleviate the influence of diminutive,low-resolution markings on the precision of deep learning models,we utilize an upscaling approach for enhanced resolution.This technique relies on the Super-Resolution Generative Adversarial Network with Extended Training(SRGANext)network,facilitating the reconstruction of high-fidelity images that align with input specifications.Subsequently,we replace the original YOLOv8smodel’s backbone feature extraction network with the lightweight VanillaNetwork(VanillaNet),simplifying the branch structure to reduce network parameters.Finally,a Hybrid Attention Mechanism(HAM)is implemented to capture essential details from input images,improving feature representation while concurrently expediting model inference speed.Experimental results demonstrate that the Van-YOLOv8 network outperforms the original YOLOv8s on a recycled chip dataset in various aspects.Significantly,it demonstrates superiority in parameter count,computational intricacy,precision in identifying targets,and speed when compared to certain prevalent algorithms in the current landscape.The proposed approach proves promising for real-time detection of recycled chips in practical factory settings.
基金This work was supported by the National Natural Science Foundation of China(Grant No.U20A20197).
文摘We propose a novel image segmentation algorithm to tackle the challenge of limited recognition and segmentation performance in identifying welding seam images during robotic intelligent operations.Initially,to enhance the capability of deep neural networks in extracting geometric attributes from depth images,we developed a novel deep geometric convolution operator(DGConv).DGConv is utilized to construct a deep local geometric feature extraction module,facilitating a more comprehensive exploration of the intrinsic geometric information within depth images.Secondly,we integrate the newly proposed deep geometric feature module with the Fully Convolutional Network(FCN8)to establish a high-performance deep neural network algorithm tailored for depth image segmentation.Concurrently,we enhance the FCN8 detection head by separating the segmentation and classification processes.This enhancement significantly boosts the network’s overall detection capability.Thirdly,for a comprehensive assessment of our proposed algorithm and its applicability in real-world industrial settings,we curated a line-scan image dataset featuring weld seams.This dataset,named the Standardized Linear Depth Profile(SLDP)dataset,was collected from actual industrial sites where autonomous robots are in operation.Ultimately,we conducted experiments utilizing the SLDP dataset,achieving an average accuracy of 92.7%.Our proposed approach exhibited a remarkable performance improvement over the prior method on the identical dataset.Moreover,we have successfully deployed the proposed algorithm in genuine industrial environments,fulfilling the prerequisites of unmanned robot operations.
基金Researchers Supporting Project Number(RSPD2024R576),King Saud University,Riyadh,Saudi Arabia。
文摘Cardiovascular disease is the leading cause of death globally.This disease causes loss of heart muscles and is also responsible for the death of heart cells,sometimes damaging their functionality.A person’s life may depend on receiving timely assistance as soon as possible.Thus,minimizing the death ratio can be achieved by early detection of heart attack(HA)symptoms.In the United States alone,an estimated 610,000 people die fromheart attacks each year,accounting for one in every four fatalities.However,by identifying and reporting heart attack symptoms early on,it is possible to reduce damage and save many lives significantly.Our objective is to devise an algorithm aimed at helping individuals,particularly elderly individuals living independently,to safeguard their lives.To address these challenges,we employ deep learning techniques.We have utilized a vision transformer(ViT)to address this problem.However,it has a significant overhead cost due to its memory consumption and computational complexity because of scaling dot-product attention.Also,since transformer performance typically relies on large-scale or adequate data,adapting ViT for smaller datasets is more challenging.In response,we propose a three-in-one steam model,theMulti-Head Attention Vision Hybrid(MHAVH).Thismodel integrates a real-time posture recognition framework to identify chest pain postures indicative of heart attacks using transfer learning techniques,such as ResNet-50 and VGG-16,renowned for their robust feature extraction capabilities.By incorporatingmultiple heads into the vision transformer to generate additional metrics and enhance heart-detection capabilities,we leverage a 2019 posture-based dataset comprising RGB images,a novel creation by the author that marks the first dataset tailored for posture-based heart attack detection.Given the limited online data availability,we segmented this dataset into gender categories(male and female)and conducted testing on both segmented and original datasets.The training accuracy of our model reached an impressive 99.77%.Upon testing,the accuracy for male and female datasets was recorded at 92.87%and 75.47%,respectively.The combined dataset accuracy is 93.96%,showcasing a commendable performance overall.Our proposed approach demonstrates versatility in accommodating small and large datasets,offering promising prospects for real-world applications.