Multimodal lung tumor medical images, such as Positron Emission Computed Tomography (PET), Computed Tomography (CT), and PET-CT, provide anatomical and functional information for the same lesion. The key questions are how to use this anatomical and functional information effectively and how to improve segmentation performance. To address them, this paper proposes the Saliency Feature-Guided Interactive Feature Enhancement Lung Tumor Segmentation Network (Guide-YNet). Firstly, a double-encoder single-decoder U-Net serves as the backbone, while a single-encoder single-decoder U-Net generates a saliency-guided feature from the PET image and passes it into the skip connections of the backbone, so that the high sensitivity of PET images to tumors guides the network to locate lesions accurately. Secondly, a Cross Scale Feature Enhancement Module (CSFEM) is designed to extract multi-scale fusion features after downsampling. Thirdly, a Cross-Layer Interactive Feature Enhancement Module (CIFEM) is designed in the encoder to enhance spatial position information and semantic information. Finally, a Cross-Dimension Cross-Layer Feature Enhancement Module (CCFEM) is proposed in the decoder, which extracts multimodal image features through global attention and multi-dimension local attention. The proposed method is verified on lung multimodal medical image datasets, and the results show that the Mean Intersection over Union (MIoU), Accuracy (Acc), Dice Similarity Coefficient (Dice), Volumetric overlap error (Voe), and Relative volume difference (Rvd) of the proposed method on lung lesion segmentation are 87.27%, 93.08%, 97.77%, 95.92%, 89.28%, and 88.68%, respectively. The method is of great significance for computer-aided diagnosis.
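The overlap metrics named in the abstract above can be illustrated with a minimal sketch. It assumes flattened binary masks (1 = lesion, 0 = background) and the common textbook definitions; the function name `overlap_metrics` is hypothetical, the IoU here is the foreground IoU rather than a class-averaged MIoU, and the paper may define or report Voe and Rvd differently.

```python
from typing import Dict, Sequence


def overlap_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Overlap metrics for two flattened binary masks (1 = lesion)."""
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("masks must be non-empty and of equal length")

    # Confusion-matrix counts over all pixels/voxels.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = len(pred) - tp - fp - fn

    union = tp + fp + fn
    # Convention (assumption): two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0

    pred_volume = tp + fp
    truth_volume = tp + fn
    # Relative volume difference is undefined when the reference is empty.
    rvd = (pred_volume - truth_volume) / truth_volume if truth_volume else float("nan")

    return {
        "iou": iou,
        "dice": dice,
        "acc": (tp + tn) / len(pred),
        "voe": 1.0 - iou,  # volumetric overlap error as 1 - IoU
        "rvd": rvd,
    }


if __name__ == "__main__":
    print(overlap_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```

Under these definitions Dice and IoU are monotonically related (Dice = 2·IoU / (1 + IoU)), so the two rank segmentations identically even though their numeric values differ.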
With the development of social media and the prevalence of mobile devices, an increasing number of people use social media platforms to express their opinions and attitudes, leading to many online controversies. These online controversies can severely threaten social stability, making automatic detection of controversies particularly necessary. Most controversy detection methods currently focus on mining features from text semantics and propagation structures. However, these methods have two drawbacks: 1) limited ability to capture structural features and failure to learn deeper structural features, and 2) neglect of the influence of topic information and ineffective use of topic features. To address these drawbacks, this paper proposes a social media controversy detection method called Dual Feature Enhanced Graph Convolutional Network (DFE-GCN). This method explores structural information at different scales from global and local perspectives to capture deeper structural features, enhancing the expressive power of structural features. Furthermore, to strengthen the influence of topic information, this paper uses attention mechanisms to enhance topic features after each graph convolutional layer. We validated our method on two public datasets, and the experimental results demonstrate that it achieves state-of-the-art performance compared to baseline methods. On the Weibo and Reddit datasets, the accuracy is improved by 5.92% and 3.32%, respectively, and the F1 score is improved by 1.99% and 2.17%, demonstrating the positive impact of enhanced structural features and topic features on controversy detection.
In the field of medical image diagnosis, the challenge lies in tracking and identifying defective cells and the extent of the defective region within the complex structure of the brain cavity. Locating the defective cells precisely during the diagnosis phase helps to fight the greatest exterminator of mankind. Early detection of these defective cells requires an accurate computer-aided diagnostic (CAD) system that supports early treatment and improves patient survival rates. Earlier CAD systems relied heavily on the expertise of radiologists and took more time to identify the defective region. This manuscript exploits the efficacy of coalescing features such as the intensity, shape, and texture of the magnetic resonance image (MRI). In the Enhanced Feature Fusion Segmentation based classification method (EEFS), the image is enhanced and segmented to extract the prominent features. To achieve the desired effect, the EEFS method uses Enhanced Local Binary Pattern (EnLBP), Partisan Gray Level Co-occurrence Matrix Histogram of Oriented Gradients (PGLCMHOG), and the iGrab cut method to segment the image. These prominent features, along with deep features, are coalesced into a single-dimensional feature vector that is used for prediction. The coalesced vector is used with existing classifiers to compare their results against those obtained with the generated vector. The generated vector provides promising results with commendably less computational time for pre-processing and classification of MR medical images.
The detection of brain disease is an essential issue in medical and research areas. Deep learning techniques have shown promising results in detecting and diagnosing brain diseases using magnetic resonance imaging (MRI) images. These techniques involve training neural networks on large datasets of MRI images, allowing the networks to learn patterns and features indicative of different brain diseases. However, several challenges and limitations still need to be addressed to further improve the accuracy and effectiveness of these techniques. This paper implements a Feature Enhanced Stacked Auto Encoder (FESAE) model to detect brain diseases. The results of the standard stacked auto encoder are trivial and not robust enough to boost the system's accuracy. Therefore, the standard Stacked Auto Encoder (SAE) is replaced with a Stacked Feature Enhanced Auto Encoder with a feature enhancement function to efficiently and effectively obtain non-trivial features with less activation energy from an image. The proposed model consists of four stages. First, pre-processing is performed to remove noise, and the greyscale image is converted to Red, Green, and Blue (RGB) to enhance feature details for discriminative feature extraction. Second, feature extraction is performed to extract significant features for classification using Discrete Wavelet Transform (DWT) and Channelization. Third, classification is performed to classify MRI images into four major classes: Normal, Tumor, Brain Stroke, and Alzheimer's. Finally, the FESAE model outperforms state-of-the-art machine learning and deep learning methods such as Artificial Neural Network (ANN), SAE, Random Forest (RF), and Logistic Regression (LR) by achieving a high accuracy of 98.61% on a dataset of 2000 MRI images. The proposed model has significant potential for assisting radiologists in diagnosing brain diseases more accurately and improving patient outcomes.
To realize high-precision automatic measurement of two-dimensional geometric features on parts, a cooperative measurement system based on machine vision is constructed. Its hardware structure, functional composition and working principle are introduced. The mapping relationship between the feature image coordinates and the measuring space coordinates is established. A method for planning the measuring path of small field of view (FOV) images is proposed. Guided by the panoramic image of the object to be measured, small FOV images with high object plane resolution are acquired automatically. Then, the auxiliary measuring characteristics are constructed and the parameters of the features to be measured are automatically extracted. Experimental results show that the absolute value of the relative error is less than 0.03% when the cooperative measurement system is used to gauge a hole distance of 100 mm nominal size. When the object plane resolving power of the small FOV images is 16 times that of the large FOV image, the measurement accuracy of the small FOV images is improved by 14 times compared with the large FOV image. The system is suitable for high-precision automatic measurement of two-dimensional complex geometric features distributed on large-scale parts.
Single-modal visible light or infrared images each provide limited information: infrared light captures significant thermal radiation data, whereas visible light excels at presenting detailed texture information. Combining images obtained from both modalities leverages their respective strengths and mitigates their individual limitations, resulting in high-quality images with enhanced contrast and rich texture details. Such capabilities hold promise for advanced visual tasks including target detection, instance segmentation, military surveillance, and pedestrian detection, among others. This paper introduces a novel approach, a dual-branch decomposition fusion network based on AutoEncoder (AE), which decomposes multi-modal features into intensity and texture information for enhanced fusion. A local contrast enhancement module (CEM) and a texture detail enhancement module (DEM) are devised to process the decomposed images, followed by image fusion through the decoder. The proposed loss function ensures effective retention of key information from the source images of both modalities. Extensive comparisons and generalization experiments demonstrate the superior performance of our network in preserving pixel intensity distribution and retaining texture details. The qualitative results show the advantages in fusion details and local contrast. In the quantitative experiments, entropy (EN), mutual information (MI), structural similarity (SSIM) and other results improve on and, as a whole, exceed the state-of-the-art (SOTA) models.
In the era of the Internet, widely used web applications have become the target of hacker attacks because they contain a large amount of personal information. Among these attacks, stealing private data through cross-site scripting (XSS) is one of the most commonly used by hackers. Currently, deep learning-based XSS attack detection methods have good application prospects; however, they suffer from problems such as being prone to overfitting, a high false alarm rate, and low accuracy. To address these issues, we propose a multi-stage feature extraction and fusion model for XSS detection based on Random Forest feature enhancement. The model uses Random Forests to capture the intrinsic structure and patterns of the data by extracting leaf node indices as features, which are subsequently merged with the original data features to form a feature set with richer information content. Further feature extraction is conducted through three parallel channels. Channel I uses parallel one-dimensional convolutional layers (1D convolutional layers) with different convolutional kernel sizes to extract local features at different scales and perform multi-scale feature fusion; Channel II employs maximum one-dimensional pooling layers (max 1D pooling layers) of various sizes to extract key features from the data; and Channel III extracts global information bi-directionally using a Bi-Directional Long Short-Term Memory Network (Bi-LSTM) and incorporates a multi-head attention mechanism to enhance global features. Finally, classification and prediction of XSS are performed by fusing the features of the three channels. To test the effectiveness of the model, we conduct experiments on six datasets. We achieve an accuracy of 100% on the UNSW-NB15 dataset and 99.99% on the CICIDS2017 dataset, which is higher than that of the existing models.
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses an attention module to make the network focus more on the semantic information of the object and fuses it into the lower layer, so that each layer contains similar semantic information. This alleviates the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the details of the small object, and fuse the enhanced features into other feature layers, so that each layer is rich in small object information and small object detection accuracy improves. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MS COCO) and Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Light–matter interactions in two-dimensional (2D) materials have been the focus of research since the discovery of graphene. The light–matter interaction length in 2D materials is, however, much shorter than that in bulk materials owing to the atomic nature of 2D materials. Plasmonic nanostructures are usually integrated with 2D materials to enhance the light–matter interactions, offering great opportunities for both fundamental research and technological applications. Nanoparticle-on-mirror (NPoM) structures with extremely confined optical fields are highly desired in this respect. In addition, 2D materials provide a good platform for the study of plasmonic fields with subnanometer resolution and of quantum plasmonics down to the characteristic length scale of a single atom. A focused and up-to-date review article is highly desired for a timely summary of the progress in this rapidly growing field and to encourage more research efforts in this direction. In this review, we first introduce the basic concepts of plasmonic modes in NPoM structures. Interactions between plasmons and quasi-particles in 2D materials, e.g., excitons and phonons, from weak to strong coupling, and their potential applications are then described in detail. Related phenomena in subnanometer metallic gaps separated by 2D materials, such as quantum tunneling, are also touched upon. We finally discuss phenomena and physical processes that are not yet clearly understood and provide an outlook for future research. We believe that hybrid systems of 2D materials and NPoM structures will be a promising research field in the future.
Manhole cover defect recognition is of significant practical importance as it can accurately identify damaged or missing covers, enabling timely replacement and maintenance. Traditional manhole cover detection techniques primarily focus on detecting the presence of covers rather than classifying the types of defects. However, manhole cover defects exhibit small inter-class feature differences and large intra-class feature variations, which makes their recognition challenging. To improve the classification of manhole cover defect types, we propose a Progressive Dual-Branch Feature Fusion Network (PDBFFN). The baseline backbone network adopts a multi-stage hierarchical architecture using ResNet50 as the visual feature extractor, from which both local and global information is obtained. Additionally, a Feature Enhancement Module (FEM) and a Fusion Module (FM) are introduced to enhance the network's ability to learn critical features. Experimental results demonstrate that our model achieves a classification accuracy of 82.6% on a manhole cover defect dataset, outperforming several state-of-the-art fine-grained image classification models.
Traffic scene captioning technology automatically generates one or more sentences to describe the content of traffic scenes by analyzing the input traffic scene images, supporting road safety while providing an important decision-making function for sustainable transportation. To provide a comprehensive and reasonable description of complex traffic scenes, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. In general, the model follows an encoder-decoder structure. First, multi-level granularity visual features are used for feature enhancement during the encoding process, which enables the model to learn more detailed content in the traffic scene image. Second, the scene knowledge graph is applied to the decoding process, and the semantic features it provides are used to enhance the features learned by the decoder again, so that the model can learn the attributes of objects in the traffic scene and the relationships between objects to generate more reasonable captions. This paper reports extensive experiments on the challenging MS-COCO dataset, evaluated by five standard automatic evaluation metrics. The results show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, notably achieving a score of 129.0 on the CIDEr-D evaluation metric, which indicates that the proposed model can provide a more reasonable and comprehensive description of the traffic scene.
At present, knowledge embedding methods are widely used in the field of knowledge graph (KG) reasoning and have been successfully applied to KGs with large numbers of entities and relations. However, in research and production environments, there are many KGs with a small number of entities and relations, which are called sparse KGs. Limited by the performance of knowledge extraction methods or for other reasons (for example, some common-sense information does not appear in the natural corpus), the relations between entities are often incomplete. To solve this problem, a method combining a graph neural network with information enhancement is proposed. The improved method increases the mean reciprocal rank (MRR) and Hit@3 by 1.6% and 1.7%, respectively, when the sparsity of the FB15K-237 dataset is 10%. When the sparsity is 50%, the evaluation indexes MRR and Hit@10 are increased by 0.8% and 1.8%, respectively.
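The two link-prediction metrics reported in the abstract above, MRR and Hit@k, are both computed from the rank assigned to the correct entity for each test query. The following is a minimal sketch under that standard definition; the function names and the example ranks are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where rank 1 means the correct entity was ranked first."""
    if not ranks or any(r < 1 for r in ranks):
        raise ValueError("ranks must be a non-empty sequence of integers >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct entity appears in the top k."""
    if not ranks or k < 1:
        raise ValueError("ranks must be non-empty and k must be >= 1")
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    # Hypothetical ranks of the correct tail entity for five test triples.
    example_ranks = [1, 3, 2, 10, 50]
    print("MRR    :", mean_reciprocal_rank(example_ranks))
    print("Hit@3  :", hits_at_k(example_ranks, 3))
    print("Hit@10 :", hits_at_k(example_ranks, 10))
```

MRR rewards placing the correct entity near the very top, whereas Hit@k only checks membership in the top k, which is why the two can move by different amounts for the same model change.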
This paper presents a system that extracts the optimal features from breast tumors using Enhanced Cuckoo Search (ECS). Texture, intensity histogram, radial distance, and shape features are extracted, and the optimal feature set is obtained using ECS. The overall accuracy of a minimum distance classifier and a k-Nearest Neighbor (k-NN) classifier on validation samples is used as the fitness value for ECS. The new approach is applied to the extracted feature dataset. The proposed system selects only the minimum number of features and achieves an accuracy of 98.75% with the Minimum Distance Classifier and 99.13% with the k-NN Classifier. The performance of the new ECS is compared with that of Cuckoo Search and Harmony Search. The results show that the ECS algorithm is more accurate than the other algorithms. The proposed system can provide valuable information to the physician in medical pathology.
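The fitness evaluation described in the abstract above, validation accuracy of a classifier on a candidate feature subset, can be sketched as follows. This is only an illustration under stated assumptions: the feature subset is encoded as a 0/1 mask, the minimum distance classifier is taken to be a nearest-class-centroid rule with Euclidean distance, and all names (`subset_fitness`, `class_centroids`, the toy data) are hypothetical. The search procedure itself (ECS) is not shown.

```python
from collections import defaultdict
from math import dist
from typing import Dict, Hashable, List, Sequence


def select(row: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Keep only the features whose mask entry is non-zero."""
    return [value for value, keep in zip(row, mask) if keep]


def class_centroids(X: Sequence[Sequence[float]],
                    y: Sequence[Hashable]) -> Dict[Hashable, List[float]]:
    """Mean feature vector per class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}


def subset_fitness(train_X, train_y, val_X, val_y, mask: Sequence[int]) -> float:
    """Validation accuracy of a nearest-centroid classifier on the masked features."""
    if not any(mask):
        return 0.0  # an empty feature subset cannot classify anything
    centroids = class_centroids([select(r, mask) for r in train_X], train_y)
    correct = 0
    for row, label in zip(val_X, val_y):
        x = select(row, mask)
        predicted = min(centroids, key=lambda c: dist(centroids[c], x))
        correct += predicted == label
    return correct / len(val_y)


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise.
    train_X = [[0, 5], [1, -5], [10, 5], [11, -5]]
    train_y = [0, 0, 1, 1]
    val_X = [[0.5, 5], [10.5, -5]]
    val_y = [0, 1]
    for mask in ([1, 0], [0, 1], [1, 1]):
        print(mask, subset_fitness(train_X, train_y, val_X, val_y, mask))
```

A search algorithm would call `subset_fitness` once per candidate mask and keep the masks that score highest, optionally penalizing masks with many selected features.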
A machine learning based speech enhancement method is proposed to improve the intelligibility of whispered speech. A binary mask estimated by a two-class support vector machine (SVM) classifier is used to synthesize the enhanced whisper. A novel noise-robust feature called Gammatone feature cosine coefficients (GFCCs), extracted by an auditory periphery model, is derived and used for the binary mask estimation. The intelligibility performance of the proposed method is evaluated and compared with traditional speech enhancement methods. Objective and subjective evaluation results indicate that the proposed method can effectively improve the intelligibility of whispered speech contaminated by noise. Compared with the power subtraction algorithm and the log-MMSE algorithm, neither of which improves intelligibility in lower signal-to-noise ratio (SNR) environments, the proposed method performs well in improving the intelligibility of noisy whisper. Additionally, the intelligibility of the whispered speech enhanced by the proposed method is higher than that of the corresponding unprocessed noisy whispered speech.
The recent development of cardiac magnetic resonance (CMR) techniques has allowed detailed analyses of cardiac function and tissue characterization with high spatial resolution. We review characteristic CMR features in ischemic and non-ischemic cardiomyopathies (ICM and NICM), especially in terms of the location and distribution of late gadolinium enhancement (LGE). CMR in ICM shows segmental wall motion abnormalities or wall thinning in a particular coronary arterial territory, together with subendocardial or transmural LGE. LGE in NICM generally does not correspond to any particular coronary artery distribution and is located mostly in the mid-wall to subepicardial layer. The analysis of LGE distribution is valuable for differentiating NICM with diffusely impaired systolic function, including dilated cardiomyopathy, end-stage hypertrophic cardiomyopathy (HCM), cardiac sarcoidosis, and myocarditis, from NICM with diffuse left ventricular (LV) hypertrophy, including HCM, cardiac amyloidosis and Anderson-Fabry disease. A transient low signal intensity LGE in regions of severe LV dysfunction is a particular feature of stress cardiomyopathy. In arrhythmogenic right ventricular cardiomyopathy/dysplasia, enhancement of the right ventricular (RV) wall with functional and morphological changes of the RV becomes apparent. Finally, the analysis of LGE distribution has the potential to predict cardiac outcomes and response to treatment.
Diagnosing gastrointestinal cancer by classical means is a hazardous procedure. Recent years have seen several computerized solutions for stomach disease detection and classification. However, the existing techniques face challenges such as irrelevant feature extraction, high similarity among different disease symptoms, and less important features extracted from a single source. This paper designs a new deep learning-based architecture based on the fusion of two models, Residual blocks and Auto Encoder. First, the Hyper-Kvasir dataset is employed to evaluate the proposed work. A pre-trained convolutional neural network (CNN) model is selected and improved with several residual blocks. This process aims to improve the learning capability of deep models and reduce the number of parameters. In addition, an Auto-Encoder-based network is designed, consisting of five convolutional layers in the encoder stage and five in the decoder stage. The global average pooling and convolutional layers are selected for feature extraction, and the extracted features are optimized by a hybrid Marine Predator optimization and Slime Mould optimization algorithm. The features of both models are fused using a novel fusion technique and then classified using an Artificial Neural Network classifier. The experiments were conducted on the HyperKvasir dataset, which consists of 23 stomach-infection classes. The proposed method obtained an improved accuracy of 93.90% on this dataset. A comparison with some recent techniques also shows that the proposed method improves accuracy.
Manual diagnosis of brain tumors using magnetic resonance images (MRI) is a hectic and time-consuming process. It also always requires an expert for the diagnosis. Therefore, many computer-controlled methods for diagnosing and classifying brain tumors have been introduced in the literature. This paper proposes a novel multimodal brain tumor classification framework based on two-way deep learning feature extraction and a hybrid feature optimization algorithm. NasNet-Mobile, a pre-trained deep learning model, has been fine-tuned and two-way trained on original and enhanced MRI images. The haze-convolutional neural network (haze-CNN) approach is developed and employed on the original images for contrast enhancement. Next, transfer learning (TL) is used for training the two-way fine-tuned models and extracting feature vectors from the global average pooling layer. Then, using a multiset canonical correlation analysis (CCA) method, features of both deep learning models are fused into a single feature matrix; this technique aims to enrich the feature information for better classification. Although the information increased, the computational time also rose. This issue is resolved using a hybrid feature optimization algorithm that chooses the best classification features. The experiments were done on two publicly available datasets, BraTs2018 and BraTs2019, and yielded accuracy rates of 94.8% and 95.7%, respectively. The proposed method is compared with several recent studies and outperforms them in accuracy. In addition, we analyze the performance of each intermediate step of the proposed approach and find that the selection technique strengthens the proposed framework.
On the basis of the second-order perturbation approximation and the modal expansion approach, we investigate the enhancement effect of cumulative second-harmonic generation (SHG) of circumferential guided waves (CGWs) in a circular tube, which is inherently induced by the closed propagation feature of CGWs. An appropriate mode pair of primary- and double-frequency CGWs satisfying phase velocity matching and nonzero energy flux is selected to ensure that the second harmonic generated by primary CGW propagation can accumulate along the circumference. Using a coherent superposition of multiple waves, a model of unidirectional CGW propagation is established for analyzing the enhancement effect of cumulative SHG of the selected primary CGW mode. The theoretical analyses and numerical simulations directly demonstrate that the generated second harmonic does have a cumulative effect along the circumferential direction and that the closed propagation feature of CGWs does enhance the magnitude of the cumulative second harmonic. Potential applications of the enhancement effect of cumulative SHG of CGWs are considered and discussed. The theoretical analysis and numerical simulation perspective presented here yields previously unavailable insight into the physical mechanism by which the closed propagation feature of CGWs enhances cumulative SHG in a circular tube.
Sea cucumber detection is widely recognized as the key to automatic culture. The underwater light environment is complex, and sea cucumbers are easily obscured by mud, sand, reefs, and other underwater organisms. To date, research on sea cucumber detection has mostly concentrated on the distinction between prospective objects and the background. However, the key to proper distinction is the effective extraction of sea cucumber feature information. In this study, the edge-enhanced scaling You Only Look Once-v4 (YOLOv4) (ESYv4) is proposed for sea cucumber detection. To emphasize the target features and reduce the impact of varying underwater hues and brightness values on the misjudgment of sea cucumbers, a bidirectional cascade network (BDCN) is used to extract the overall edge greyscale image, which is added to the original RGB image to form the detection input. Meanwhile, the YOLOv4 backbone detection model is scaled, and the number of parameters is reduced to 48% of the original. Validation results on 783 images indicate that the detection precision for positive sea cucumber samples reaches 0.941. This improvement reflects that the algorithm is more effective at enhancing the edge feature information of the target. It thus contributes to the automatic multi-objective detection of underwater sea cucumbers.
In pursuit of cost-effective manufacturing,enterprises are increasingly adopting the practice of utilizing recycled semiconductor chips.To ensure consistent chip orientation during packaging,a circular marker on the f...In pursuit of cost-effective manufacturing,enterprises are increasingly adopting the practice of utilizing recycled semiconductor chips.To ensure consistent chip orientation during packaging,a circular marker on the front side is employed for pin alignment following successful functional testing.However,recycled chips often exhibit substantial surface wear,and the identification of the relatively small marker proves challenging.Moreover,the complexity of generic target detection algorithms hampers seamless deployment.Addressing these issues,this paper introduces a lightweight YOLOv8s-based network tailored for detecting markings on recycled chips,termed Van-YOLOv8.Initially,to alleviate the influence of diminutive,low-resolution markings on the precision of deep learning models,we utilize an upscaling approach for enhanced resolution.This technique relies on the Super-Resolution Generative Adversarial Network with Extended Training(SRGANext)network,facilitating the reconstruction of high-fidelity images that align with input specifications.Subsequently,we replace the original YOLOv8smodel’s backbone feature extraction network with the lightweight VanillaNetwork(VanillaNet),simplifying the branch structure to reduce network parameters.Finally,a Hybrid Attention Mechanism(HAM)is implemented to capture essential details from input images,improving feature representation while concurrently expediting model inference speed.Experimental results demonstrate that the Van-YOLOv8 network outperforms the original YOLOv8s on a recycled chip dataset in various aspects.Significantly,it demonstrates superiority in parameter count,computational intricacy,precision in identifying targets,and speed when compared to certain prevalent algorithms in the current 
landscape.The proposed approach proves promising for real-time detection of recycled chips in practical factory settings.展开更多
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 62062003) and the Natural Science Foundation of Ningxia (Grant No. 2023AAC03293).
Abstract: Multimodal lung tumor medical images, such as Positron Emission Computed Tomography (PET), Computed Tomography (CT), and PET-CT, can provide anatomical and functional information for the same lesion. How to use this anatomical and functional information effectively and improve the network's segmentation performance are key questions. To solve the problem, the Saliency Feature-Guided Interactive Feature Enhancement Lung Tumor Segmentation Network (Guide-YNet) is proposed in this paper. Firstly, a double-encoder single-decoder U-Net is used as the backbone of the model; a single-encoder single-decoder U-Net generates the saliency-guided feature from the PET image and transmits it into the skip connections of the backbone, so that the high sensitivity of PET images to tumors guides the network to locate lesions accurately. Secondly, a Cross Scale Feature Enhancement Module (CSFEM) is designed to extract multi-scale fusion features after downsampling. Thirdly, a Cross-Layer Interactive Feature Enhancement Module (CIFEM) is designed in the encoder to enhance the spatial position information and semantic information. Finally, a Cross-Dimension Cross-Layer Feature Enhancement Module (CCFEM) is proposed in the decoder, which effectively extracts multimodal image features through global attention and multi-dimension local attention. The proposed method is verified on lung multimodal medical image datasets, and the results show that the Mean Intersection over Union (MIoU), Accuracy (Acc), Dice Similarity Coefficient (Dice), Volumetric overlap error (Voe), and Relative volume difference (Rvd) of the proposed method on lung lesion segmentation are 87.27%, 93.08%, 97.77%, 95.92%, 89.28%, and 88.68%, respectively. It is of great significance for computer-aided diagnosis.
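The overlap metrics named in the abstract above can be illustrated with a small sketch. It assumes the commonly used definitions (IoU as intersection over union, Dice as twice the intersection over the sum of the two mask sizes, and volumetric overlap error as 1 minus IoU); the abstract does not state the exact formulas the authors used, and the function name `overlap_metrics` and the flat binary-list input are hypothetical choices made for illustration only.

```python
def overlap_metrics(pred, truth):
    """Compute IoU, Dice and VOE for two flat binary masks.

    Assumption: standard textbook definitions, not necessarily the
    exact variants used in the Guide-YNet paper.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    # Count foreground pixels in each mask and in their overlap.
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_n = sum(1 for p in pred if p)
    truth_n = sum(1 for t in truth if t)
    union = pred_n + truth_n - inter

    # Two empty masks agree perfectly by convention.
    if union == 0:
        return {"iou": 1.0, "dice": 1.0, "voe": 0.0}

    iou = inter / union
    dice = 2 * inter / (pred_n + truth_n)
    return {"iou": iou, "dice": dice, "voe": 1.0 - iou}


if __name__ == "__main__":
    # Toy 4-pixel masks: one pixel overlaps, three are in the union.
    print(overlap_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```

Real segmentation masks are 2D or 3D arrays; flattening them first keeps the sketch free of third-party dependencies.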
Funding: Funded by the Natural Science Foundation of China (Grant No. 202204120017) and the Autonomous Region Science and Technology Program (Grant Nos. 2022B01008-2 and 2020A02001-1).
Abstract: With the development of social media and the prevalence of mobile devices, an increasing number of people use social media platforms to express their opinions and attitudes, leading to many online controversies. These online controversies can severely threaten social stability, making automatic detection of controversies particularly necessary. Most controversy detection methods currently focus on mining features from text semantics and propagation structures. However, these methods have two drawbacks: 1) limited ability to capture structural features and failure to learn deeper structural features, and 2) neglect of the influence of topic information and ineffective use of topic features. To address these drawbacks, this paper proposes a social media controversy detection method called Dual Feature Enhanced Graph Convolutional Network (DFE-GCN). This method explores structural information at different scales from global and local perspectives to capture deeper structural features, enhancing the expressive power of structural features. Furthermore, to strengthen the influence of topic information, this paper uses attention mechanisms to enhance topic features after each graph convolutional layer. We validated our method on two public datasets, and the experimental results demonstrate that it achieves state-of-the-art performance compared to baseline methods. On the Weibo and Reddit datasets, the accuracy is improved by 5.92% and 3.32%, respectively, and the F1 score is improved by 1.99% and 2.17%, demonstrating the positive impact of enhanced structural features and topic features on controversy detection.
Abstract: In the field of medical image diagnosis, the challenge lies in tracking and identifying the defective cells and the extent of the defective region within the complex structure of a brain cavity. Locating the defective cells precisely during the diagnosis phase helps to fight the greatest exterminator of mankind. Early detection of these defective cells requires an accurate computer-aided diagnostic system (CAD) that supports early treatment and promotes survival rates of patients. Earlier versions of CAD systems relied greatly on the expertise of the radiologist and took more time to identify the defective region. The manuscript exploits the efficacy of coalescing features such as intensity, shape, and texture of the magnetic resonance image (MRI). In the Enhanced Feature Fusion Segmentation based classification method (EEFS), the image is enhanced and segmented to extract the prominent features. To bring out the desired effect, the EEFS method uses Enhanced Local Binary Pattern (EnLBP), Partisan Gray Level Co-occurrence Matrix Histogram of Oriented Gradients (PGLCMHOG), and the iGrab cut method to segment the image. These prominent features, along with deep features, are coalesced to provide a single-dimensional feature vector that is effectively used for prediction. The coalesced vector is used with the existing classifiers to compare the results of these classifiers with that of the generated vector. The generated vector provides promising results with commendably less computational time for pre-processing and classification of MR medical images.
Funding: Supported by Universiti Sains Malaysia (USM) under FRGS Grant Number FRGS/1/2020/TK03/USM/02/1 and by the School of Computer Sciences, USM.
Abstract: The detection of brain disease is an essential issue in medical and research areas. Deep learning techniques have shown promising results in detecting and diagnosing brain diseases using magnetic resonance imaging (MRI) images. These techniques involve training neural networks on large datasets of MRI images, allowing the networks to learn patterns and features indicative of different brain diseases. However, several challenges and limitations still need to be addressed to further improve the accuracy and effectiveness of these techniques. This paper implements a Feature Enhanced Stacked Auto Encoder (FESAE) model to detect brain diseases. The standard stacked auto encoder's results are trivial and not robust enough to boost the system's accuracy. Therefore, the standard Stacked Auto Encoder (SAE) is replaced with a Stacked Feature Enhanced Auto Encoder with a feature enhancement function to efficiently and effectively obtain non-trivial features with less activation energy from an image. The proposed model consists of four stages. First, pre-processing is performed to remove noise, and the greyscale image is converted to Red, Green, and Blue (RGB) to enhance feature details for discriminative feature extraction. Second, feature extraction is performed to extract significant features for classification using Discrete Wavelet Transform (DWT) and Channelization. Third, classification is performed to classify MRI images into four major classes: Normal, Tumor, Brain Stroke, and Alzheimer's. Finally, the FESAE model outperforms state-of-the-art machine learning and deep learning methods such as Artificial Neural Network (ANN), SAE, Random Forest (RF), and Logistic Regression (LR) by achieving a high accuracy of 98.61% on a dataset of 2000 MRI images. The proposed model has significant potential for assisting radiologists in diagnosing brain diseases more accurately and improving patient outcomes.
Funding: The National Natural Science Foundation of China (No. 51175267), the Natural Science Foundation of Jiangsu Province (No. BK2010481), the Ph.D. Programs Foundation of Ministry of Education of China (No. 20113219120004), China Postdoctoral Science Foundation (No. 20100481148), and the Postdoctoral Science Foundation of Jiangsu Province (No. 1001004B).
Abstract: To realize high-precision automatic measurement of two-dimensional geometric features on parts, a cooperative measurement system based on machine vision is constructed. Its hardware structure, functional composition and working principle are introduced. The mapping relationship between the feature image coordinates and the measuring space coordinates is established. A method for planning the measuring path of small field of view (FOV) images is proposed. With the cooperation of the panoramic image of the object to be measured, the small FOV images with high object plane resolution are acquired automatically. Then, the auxiliary measuring characteristics are constructed and the parameters of the features to be measured are automatically extracted. Experimental results show that the absolute value of the relative error is less than 0.03% when the cooperative measurement system is used to gauge a hole distance of 100 mm nominal size. When the object plane resolving power of the small FOV images is 16 times that of the large FOV image, the measurement accuracy of the small FOV images is improved by 14 times compared with the large FOV image. The system is suitable for high-precision automatic measurement of two-dimensional complex geometric features distributed on large-scale parts.
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 61971078) and the Chongqing Education Commission Science and Technology Major Project (No. KJZD-M202301901).
Abstract: Single-modal visible light or infrared images each provide limited information: infrared light captures significant thermal radiation data, whereas visible light excels in presenting detailed texture information. Combining images obtained from both modalities makes it possible to leverage their respective strengths and mitigate their individual limitations, resulting in high-quality images with enhanced contrast and rich texture details. Such capabilities hold promising applications in advanced visual tasks including target detection, instance segmentation, military surveillance, and pedestrian detection, among others. This paper introduces a novel approach, a dual-branch decomposition fusion network based on AutoEncoder (AE), which decomposes multi-modal features into intensity and texture information for enhanced fusion. A local contrast enhancement module (CEM) and a texture detail enhancement module (DEM) are devised to process the decomposed images, followed by image fusion through the decoder. The proposed loss function ensures effective retention of key information from the source images of both modalities. Extensive comparisons and generalization experiments demonstrate the superior performance of our network in preserving pixel intensity distribution and retaining texture details. The qualitative results show the advantages in fusion details and local contrast. In the quantitative experiments, entropy (EN), mutual information (MI), structural similarity (SSIM) and other results have improved and, as a whole, exceed those of the state-of-the-art (SOTA) model.
Abstract: In the era of the Internet, widely used web applications have become the target of hacker attacks because they contain a large amount of personal information. Among these attacks, stealing private data through cross-site scripting (XSS) is one of the most commonly used by hackers. Currently, deep learning-based XSS attack detection methods have good application prospects; however, they suffer from problems such as being prone to overfitting, a high false alarm rate, and low accuracy. To address these issues, we propose a multi-stage feature extraction and fusion model for XSS detection based on Random Forest feature enhancement. The model uses Random Forests to capture the intrinsic structure and patterns of the data by extracting leaf node indices as features, which are subsequently merged with the original data features to form a feature set with richer information content. Further feature extraction is conducted through three parallel channels. Channel I uses parallel one-dimensional convolutional layers (1D convolutional layers) with different convolutional kernel sizes to extract local features at different scales and perform multi-scale feature fusion; Channel II employs maximum one-dimensional pooling layers (max 1D pooling layers) of various sizes to extract key features from the data; and Channel III extracts global information bi-directionally using a Bi-Directional Long-Short Term Memory Network (Bi-LSTM) and incorporates a multi-head attention mechanism to enhance global features. Finally, effective classification and prediction of XSS are performed by fusing the features of the three channels. To test the effectiveness of the model, we conduct experiments on six datasets. We achieve an accuracy of 100% on the UNSW-NB15 dataset and 99.99% on the CICIDS2017 dataset, which is higher than that of the existing models.
Abstract: Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses the attention module to make the network focus more on the semantic information of the object and fuses it into the lower layer, so that each layer contains similar semantic information; this alleviates the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the details of the small object, and fuse the enhanced features into other feature layers, so that each layer is rich in small object information and small object detection accuracy improves. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MSCOCO) and Pattern Analysis Statistical Modelling and Computational Learning, Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Funding: Supported by the National Natural Science Foundation of China (62205183) and the Research Grants Council of Hong Kong (ANR/RGC, Ref. No. A-CUHK404/21).
Abstract: Light–matter interactions in two-dimensional (2D) materials have been the focus of research since the discovery of graphene. The light–matter interaction length in 2D materials is, however, much shorter than that in bulk materials owing to the atomic nature of 2D materials. Plasmonic nanostructures are usually integrated with 2D materials to enhance the light–matter interactions, offering great opportunities for both fundamental research and technological applications. Nanoparticle-on-mirror (NPoM) structures with extremely confined optical fields are highly desired in this respect. In addition, 2D materials provide a good platform for the study of plasmonic fields with subnanometer resolution and quantum plasmonics down to the characteristic length scale of a single atom. A focused and up-to-date review article is highly desired for a timely summary of the progress in this rapidly growing field and to encourage more research efforts in this direction. In this review, we first introduce the basic concepts of plasmonic modes in NPoM structures. Interactions between plasmons and quasi-particles in 2D materials, e.g., excitons and phonons, from weak to strong coupling, and potential applications are then described in detail. Related phenomena in subnanometer metallic gaps separated by 2D materials, such as quantum tunneling, are also touched upon. We finally discuss phenomena and physical processes that are not yet clearly understood and provide an outlook for future research. We believe that hybrid systems of 2D materials and NPoM structures will be a promising research field in the future.
Abstract: Manhole cover defect recognition is of significant practical importance as it can accurately identify damaged or missing covers, enabling timely replacement and maintenance. Traditional manhole cover detection techniques primarily focus on detecting the presence of covers rather than classifying the types of defects. However, manhole cover defects exhibit small inter-class feature differences and large intra-class feature variations, which makes their recognition challenging. To improve the classification of manhole cover defect types, we propose a Progressive Dual-Branch Feature Fusion Network (PDBFFN). The baseline backbone network adopts a multi-stage hierarchical architecture design using ResNet50 as the visual feature extractor, from which both local and global information is obtained. Additionally, a Feature Enhancement Module (FEM) and a Fusion Module (FM) are introduced to enhance the network’s ability to learn critical features. Experimental results demonstrate that our model achieves a classification accuracy of 82.6% on a manhole cover defect dataset, outperforming several state-of-the-art fine-grained image classification models.
Funding: Funded by (i) the Natural Science Foundation of China (NSFC) under Grant Nos. 61402397, 61263043, 61562093 and 61663046; (ii) the Open Foundation of Key Laboratory in Software Engineering of Yunnan Province, No. 2020SE304; (iii) the Practical Innovation Project of Yunnan University, Project Nos. 2021z34, 2021y128 and 2021y129.
Abstract: Traffic scene captioning technology automatically generates one or more sentences to describe the content of traffic scenes by analyzing the input traffic scene images, ensuring road safety while providing an important decision-making function for sustainable transportation. In order to provide a comprehensive and reasonable description of complex traffic scenes, a traffic scene semantic captioning model with multi-stage feature enhancement is proposed in this paper. In general, the model follows an encoder-decoder structure. First, multilevel granularity visual features are used for feature enhancement during the encoding process, which enables the model to learn more detailed content in the traffic scene image. Second, the scene knowledge graph is applied to the decoding process, and the semantic features provided by the scene knowledge graph are used to enhance the features learned by the decoder again, so that the model can learn the attributes of objects in the traffic scene and the relationships between objects to generate more reasonable captions. This paper reports extensive experiments on the challenging MS-COCO dataset, evaluated by five standard automatic evaluation metrics. The results show that the proposed model improves significantly in all metrics compared with the state-of-the-art methods, achieving in particular a score of 129.0 on the CIDEr-D evaluation metric, which also indicates that the proposed model can effectively provide a more reasonable and comprehensive description of the traffic scene.
Funding: Supported by the Sichuan Science and Technology Program under Grants No. 2022YFQ0052 and No. 2021YFQ0009.
Abstract: At present, knowledge embedding methods are widely used in the field of knowledge graph (KG) reasoning, and have been successfully applied to KGs with large numbers of entities and relationships. However, in research and production environments, there are a large number of KGs with a small number of entities and relations, which are called sparse KGs. Limited by the performance of knowledge extraction methods or for other reasons (some common-sense information does not appear in the natural corpus), the relations between entities are often incomplete. To solve this problem, a method combining a graph neural network with information enhancement is proposed. The improved method increases the mean reciprocal rank (MRR) and Hit@3 by 1.6% and 1.7%, respectively, when the sparsity of the FB15K-237 dataset is 10%. When the sparsity is 50%, the evaluation indexes MRR and Hit@10 are increased by 0.8% and 1.8%, respectively.
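As a reading aid for the evaluation indexes quoted above, the following sketch computes MRR and Hit@k from a list of ranks. It assumes the usual link-prediction convention in which each test triple yields the 1-based rank of the correct entity; the function names and the sample ranks are hypothetical and are not taken from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    # Illustrative ranks only; not results from the paper.
    sample_ranks = [1, 2, 5]
    print(mean_reciprocal_rank(sample_ranks))
    print(hits_at_k(sample_ranks, 3))
    print(hits_at_k(sample_ranks, 10))
```

Both metrics lie in the range 0 to 1, so an increase of 1.6% in MRR corresponds to a small shift of correct answers toward the top of the ranking.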
Abstract: This paper presents a system that extracts optimal features from breast tumors using Enhanced Cuckoo Search (ECS). Texture, intensity histogram, radial distance and shape features are extracted, and the optimal feature set is obtained using ECS. The overall accuracy of a minimum distance classifier and k-Nearest Neighbor (k-NN) on validation samples is used as the fitness value for ECS. The new approach is applied to the extracted feature dataset. The proposed system selects only the minimum number of features and achieves an accuracy of 98.75% with the Minimum Distance Classifier and 99.13% with the k-NN classifier. The performance of the new ECS is compared with Cuckoo Search and Harmony Search. The results show that the ECS algorithm is more accurate than the other algorithms. The proposed system can provide valuable information to the physician in medical pathology.
Funding: The National Natural Science Foundation of China (No. 61231002, 61273266, 51075068, 60872073, 60975017, 61003131), the Ph.D. Programs Foundation of the Ministry of Education of China (No. 20110092130004), the Science Foundation for Young Talents in the Educational Committee of Anhui Province (No. 2010SQRL018), and the 211 Project of Anhui University (No. 2009QN027B).
Abstract: A machine learning based speech enhancement method is proposed to improve the intelligibility of whispered speech. A binary mask estimated by a two-class support vector machine (SVM) classifier is used to synthesize the enhanced whisper. A novel noise-robust feature called Gammatone feature cosine coefficients (GFCCs), extracted by an auditory periphery model, is derived and used for the binary mask estimation. The intelligibility performance of the proposed method is evaluated and compared with traditional speech enhancement methods. Objective and subjective evaluation results indicate that the proposed method can effectively improve the intelligibility of whispered speech contaminated by noise. Compared with the power subtract algorithm and the log-MMSE algorithm, neither of which improves the intelligibility in lower signal-to-noise ratio (SNR) environments, the proposed method performs well in improving the intelligibility of noisy whisper. Additionally, the intelligibility of the whispered speech enhanced by the proposed method also exceeds that of the corresponding unprocessed noisy whispered speech.
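The masking step mentioned in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the time-frequency representation is taken to be a list of frames, each a list of channel energies, and the mask values are taken as given (in the paper they are estimated by an SVM from GFCC features, which is not reproduced here). The function name `apply_binary_mask` is hypothetical.

```python
def apply_binary_mask(tf_units, mask):
    """Keep time-frequency units where mask is 1 and zero out the rest.

    tf_units: list of frames, each a list of channel energies.
    mask:     same shape, entries 0 or 1 (assumed to come from a classifier).
    """
    if len(tf_units) != len(mask):
        raise ValueError("tf_units and mask must have the same number of frames")

    masked = []
    for frame, frame_mask in zip(tf_units, mask):
        if len(frame) != len(frame_mask):
            raise ValueError("each frame and its mask must have the same length")
        # Units labelled speech-dominated (1) pass through; others are suppressed.
        masked.append([e if m else 0.0 for e, m in zip(frame, frame_mask)])
    return masked


if __name__ == "__main__":
    energies = [[1.0, 2.0], [3.0, 4.0]]
    estimated_mask = [[1, 0], [0, 1]]  # illustrative values only
    print(apply_binary_mask(energies, estimated_mask))
```

Resynthesis of a waveform from the masked units depends on the filterbank used and is outside the scope of this sketch.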
Abstract: The recent development of cardiac magnetic resonance (CMR) techniques has allowed detailed analyses of cardiac function and tissue characterization with high spatial resolution. We review characteristic CMR features in ischemic and non-ischemic cardiomyopathies (ICM and NICM), especially in terms of the location and distribution of late gadolinium enhancement (LGE). CMR in ICM shows segmental wall motion abnormalities or wall thinning in a particular coronary arterial territory, and subendocardial or transmural LGE. LGE in NICM generally does not correspond to any particular coronary artery distribution and is located mostly in the mid-wall to subepicardial layer. The analysis of LGE distribution is valuable for differentiating NICM with diffusely impaired systolic function, including dilated cardiomyopathy, end-stage hypertrophic cardiomyopathy (HCM), cardiac sarcoidosis, and myocarditis, from those with diffuse left ventricular (LV) hypertrophy, including HCM, cardiac amyloidosis and Anderson-Fabry disease. A transient low signal intensity LGE in regions of severe LV dysfunction is a particular feature of stress cardiomyopathy. In arrhythmogenic right ventricular cardiomyopathy/dysplasia, enhancement of the right ventricular (RV) wall with functional and morphological changes of the RV becomes apparent. Finally, the analysis of LGE distribution has the potential to predict cardiac outcomes and response to treatments.
Funding: Supported by the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090), and by Supporting Project Number (PNURSP2023R387), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Diagnosing gastrointestinal cancer by classical means is a hazardous procedure. Recent years have witnessed several computerized solutions for stomach disease detection and classification. However, the existing techniques face challenges such as irrelevant feature extraction, high similarity among different disease symptoms, and the least-important features from a single source. This paper designs a new deep learning-based architecture based on the fusion of two models, Residual blocks and Auto Encoder. First, the Hyper-Kvasir dataset was employed to evaluate the proposed work. The research selected a pre-trained convolutional neural network (CNN) model and improved it with several residual blocks. This process aims to improve the learning capability of deep models and reduce the number of parameters. In addition, this article designed an Auto-Encoder-based network consisting of five convolutional layers in the encoder stage and five in the decoder phase. The research selected the global average pooling and convolutional layers for feature extraction, optimized by a hybrid Marine Predator optimization and Slime Mould optimization algorithm. The features of both models are fused using a novel fusion technique and later classified using the Artificial Neural Network classifier. The experiment was run on the HyperKvasir dataset, which consists of 23 stomach-infected classes. The proposed method obtained an improved accuracy of 93.90% on this dataset. A comparison with some recent techniques shows that the proposed method’s accuracy is improved.
Funding: Supported by the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resources from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20204010600090).
Abstract: Manual diagnosis of brain tumors using magnetic resonance images (MRI) is a hectic and time-consuming process. It also always requires an expert for the diagnosis. Therefore, many computer-controlled methods for diagnosing and classifying brain tumors have been introduced in the literature. This paper proposes a novel multimodal brain tumor classification framework based on two-way deep learning feature extraction and a hybrid feature optimization algorithm. NasNet-Mobile, a pre-trained deep learning model, has been fine-tuned and two-way trained on original and enhanced MRI images. The haze-convolutional neural network (haze-CNN) approach is developed and employed on the original images for contrast enhancement. Next, transfer learning (TL) is used for training the two-way fine-tuned models and extracting feature vectors from the global average pooling layer. Then, using a multiset canonical correlation analysis (CCA) method, the features of both deep learning models are fused into a single feature matrix; this technique aims to enhance the information in terms of features for better classification. Although the information was increased, computational time also jumped. This issue is resolved using a hybrid feature optimization algorithm that chooses the best classification features. The experiments were done on two publicly available datasets, BraTs2018 and BraTs2019, and yielded accuracy rates of 94.8% and 95.7%, respectively. The proposed method is compared with several recent studies and outperforms them in accuracy. In addition, we analyze the performance of each intermediate step of the proposed approach and find that the selection technique strengthens the proposed framework.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 11834008, 11704410, 11632004, 11474361, and U1930202).
Abstract: On the basis of the second-order perturbation approximation and the modal expansion approach, we investigate the enhancement effect of cumulative second-harmonic generation (SHG) of circumferential guided waves (CGWs) in a circular tube, which is inherently induced by the closed propagation feature of CGWs. An appropriate mode pair of primary- and double-frequency CGWs satisfying phase velocity matching and nonzero energy flux is selected to ensure that the second harmonic generated by primary CGW propagation can accumulate along the circumference. Using a coherent superposition of multiple waves, a model of unidirectional CGW propagation is established for analyzing the enhancement effect of cumulative SHG of the selected primary CGW mode. The theoretical analyses and numerical simulations directly demonstrate that the generated second harmonic does have a cumulative effect along the circumferential direction and that the closed propagation feature of CGWs does enhance the magnitude of the cumulative second harmonic. Potential applications of the enhancement effect of cumulative SHG of CGWs are considered and discussed. The theoretical analysis and numerical simulation perspective presented here yields a previously unavailable insight into the physical mechanism of the enhancement effect of cumulative SHG by the closed propagation feature of CGWs in a circular tube.
Funding: Supported by the Scientific Research Project of Tianjin Education Commission (Nos. 2020KJ091, 2018KJ184), the National Key Research and Development Program of China (No. 2020YFD0900600), the Earmarked Fund for CARS (No. CARS-47), and the Tianjin Mariculture Industry Technology System Innovation Team Construction Project (No. ITTMRS2021000).
Abstract: Sea cucumber detection is widely recognized as the key to automatic culture. The underwater light environment is complex, and targets are easily obscured by mud, sand, reefs, and other underwater organisms. To date, research on sea cucumber detection has mostly concentrated on the distinction between prospective objects and the background. However, the key to proper distinction is the effective extraction of sea cucumber feature information. In this study, the edge-enhanced scaling You Only Look Once-v4 (YOLOv4) (ESYv4) was proposed for sea cucumber detection. To emphasize the target features and reduce the impact of different underwater hues and brightness values on the misjudgment of sea cucumbers, a bidirectional cascade network (BDCN) was used to extract the overall edge greyscale image, which was added to the original RGB image to form the detection input. Meanwhile, the YOLOv4 backbone detection model was scaled, and the number of parameters was reduced to 48% of the original. Validation results on 783 images indicated that the detection precision of positive sea cucumber samples reached 0.941. This improvement shows that the algorithm is more effective at enhancing the edge feature information of the target. It thus contributes to the automatic multi-objective detection of underwater sea cucumbers.
Funding: The Liaoning Provincial Department of Education 2021 Annual Scientific Research Funding Program (Grant Numbers LJKZ0535, LJKZ0526), the 2021 Annual Comprehensive Reform of Undergraduate Education Teaching (Grant Numbers JGLX2021020, JCLX2021008), and the Graduate Innovation Fund of Dalian Polytechnic University (Grant Number 2023CXYJ13).
Abstract: In pursuit of cost-effective manufacturing, enterprises are increasingly using recycled semiconductor chips. To ensure consistent chip orientation during packaging, a circular marker on the front side is used for pin alignment after successful functional testing. However, recycled chips often exhibit substantial surface wear, and the relatively small marker is difficult to identify. Moreover, the complexity of generic target detection algorithms hampers deployment. To address these issues, this paper introduces a lightweight YOLOv8s-based network for detecting markings on recycled chips, termed Van-YOLOv8. First, to reduce the effect of small, low-resolution markings on the precision of deep learning models, we apply an upscaling approach to enhance resolution. This technique relies on the Super-Resolution Generative Adversarial Network with Extended Training (SRGANext), which reconstructs high-fidelity images that match the input specifications. Next, we replace the backbone feature extraction network of the original YOLOv8s model with the lightweight Vanilla Network (VanillaNet), simplifying the branch structure to reduce network parameters. Finally, a Hybrid Attention Mechanism (HAM) is implemented to capture essential details from input images, improving feature representation while also speeding up model inference. Experimental results demonstrate that the Van-YOLOv8 network outperforms the original YOLOv8s on a recycled chip dataset in various aspects. Notably, it is superior in parameter count, computational complexity, target detection precision, and speed when compared with several prevalent algorithms. The proposed approach is promising for real-time detection of recycled chips in practical factory settings.