Multi-material laser-based powder bed fusion (PBF-LB) allows manufacturing of parts with 3-dimensional gradient and additional functionality in a single step. This research focuses on the combination of thermally-cond...Multi-material laser-based powder bed fusion (PBF-LB) allows manufacturing of parts with 3-dimensional gradient and additional functionality in a single step. This research focuses on the combination of thermally-conductive CuCr1Zr with hard M300 tool steel.Two interface configurations of M300 on CuCr1Zr and CuCr1Zr on M300 were investigated. Ultra-fine grains form at the interface due to the low mutual solubility of Cu and steel. The material mixing zone size is dependent on the configurations and tunable in the range of0.1–0.3 mm by introducing a separate set of parameters for the interface layers. Microcracks and pores mainly occur in the transition zone.Regardless of these defects, the thermal diffusivity of bimetallic parts with 50vol% of CuCr1Zr significantly increases by 70%–150%compared to pure M300. The thermal diffusivity of CuCr1Zr and the hardness of M300 steel can be enhanced simultaneously by applying the aging heat treatment.展开更多
This paper proposes a multi-material topology optimization method based on the hybrid reliability of the probability-ellipsoid model with stress constraint for the stochastic uncertainty and epistemic uncertainty of m...This paper proposes a multi-material topology optimization method based on the hybrid reliability of the probability-ellipsoid model with stress constraint for the stochastic uncertainty and epistemic uncertainty of mechanical loads in optimization design.The probabilistic model is combined with the ellipsoidal model to describe the uncertainty of mechanical loads.The topology optimization formula is combined with the ordered solid isotropic material with penalization(ordered-SIMP)multi-material interpolation model.The stresses of all elements are integrated into a global stress measurement that approximates the maximum stress using the normalized p-norm function.Furthermore,the sequential optimization and reliability assessment(SORA)is applied to transform the original uncertainty optimization problem into an equivalent deterministic topology optimization(DTO)problem.Stochastic response surface and sparse grid technique are combined with SORA to get accurate information on the most probable failure point(MPP).In each cycle,the equivalent topology optimization formula is updated according to the MPP information obtained in the previous cycle.The adjoint variable method is used for deriving the sensitivity of the stress constraint and the moving asymptote method(MMA)is used to update design variables.Finally,the validity and feasibility of the method are verified by the numerical example of L-shape beam design,T-shape structure design,steering knuckle,and 3D T-shaped beam.展开更多
In recent years,there has been significant research on the application of deep learning(DL)in topology optimization(TO)to accelerate structural design.However,these methods have primarily focused on solving binary TO ...In recent years,there has been significant research on the application of deep learning(DL)in topology optimization(TO)to accelerate structural design.However,these methods have primarily focused on solving binary TO problems,and effective solutions for multi-material topology optimization(MMTO)which requires a lot of computing resources are still lacking.Therefore,this paper proposes the framework of multiphase topology optimization using deep learning to accelerate MMTO design.The framework employs convolutional neural network(CNN)to construct a surrogate model for solving MMTO,and the obtained surrogate model can rapidly generate multi-material structure topologies in negligible time without any iterations.The performance evaluation results show that the proposed method not only outputs multi-material topologies with clear material boundary but also reduces the calculation cost with high prediction accuracy.Additionally,in order to find a more reasonable modeling method for MMTO,this paper studies the characteristics of surrogate modeling as regression task and classification task.Through the training of 297 models,our findings show that the regression task yields slightly better results than the classification task in most cases.Furthermore,The results indicate that the prediction accuracy is primarily influenced by factors such as the TO problem,material category,and data scale.Conversely,factors such as the domain size and the material property have minimal impact on the accuracy.展开更多
In order to mimic the natural heterogeneity of native tissue and provide a better microenvironment for cell culturing,multi-material bioprinting has become a common solution to construct tissue models in vitro.With th...In order to mimic the natural heterogeneity of native tissue and provide a better microenvironment for cell culturing,multi-material bioprinting has become a common solution to construct tissue models in vitro.With the embedded printing method,complex 3D structure can be printed using soft biomaterials with reasonable shape fidelity.However,the current sequential multi-material embedded printing method faces a major challenge,which is the inevitable trade-off between the printed structural integrity and printing precision.Here,we propose a simultaneous multi-material embedded printing method.With this method,we can easily print firmly attached and high-precision multilayer structures.With multiple individually controlled nozzles,different biomaterials can be precisely deposited into a single crevasse,minimizing uncontrolled squeezing and guarantees no contamination of embedding medium within the structure.We analyse the dynamics of the extruded bioink in the embedding medium both analytically and experimentally,and quantitatively evaluate the effects of printing parameters including printing speed and rheology of embedding medium,on the 3D morphology of the printed filament.We demonstrate the printing of double-layer thin-walled structures,each layer less than 200μm,as well as intestine and liver models with 5%gelatin methacryloyl that are crosslinked and extracted from the embedding medium without significant impairment or delamination.The peeling test further proves that the proposed method offers better structural integrity than conventional sequential printing methods.The proposed simultaneous multi-material embedded printing method can serve as a powerful tool to support the complex heterogeneous structure fabrication and open unique prospects for personalized medicine.展开更多
Three-dimensional printing technologies exhibit tremendous potential in the advancing fields of tissue engineering and regenerative medicine due to the precise spatial control over depositing the biomaterial.Despite t...Three-dimensional printing technologies exhibit tremendous potential in the advancing fields of tissue engineering and regenerative medicine due to the precise spatial control over depositing the biomaterial.Despite their widespread utilization and numerous advantages,the development of suitable novel biomaterials for extrusion-based 3D printing of scaffolds that support cell attachment,proliferation,and vascularization remains a challenge.Multi-material composite hydrogels present incredible potential in this field.Thus,in this work,a multi-material composite hydrogel with a promising formulation of chitosan/gelatin functionalized with egg white was developed,which provides good printability and shape fidelity.In addition,a series of comparative analyses of different crosslinking agents and processes based on tripolyphosphate(TPP),genipin(GP),and glutaraldehyde(GTA)were investigated and compared to select the ideal crosslinking strategy to enhance the physicochemical and biological properties of the fabricated scaffolds.All of the results indicate that the composite hydrogel and the resulting scaffolds utilizing TPP crosslinking have great potential in tissue engineering,especially for supporting neo-vessel growth into the scaffold and promoting angiogenesis within engineered tissues.展开更多
Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unman...Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle(UAV)imagery.Addressing these limitations,we propose a hybrid transformer-based detector,H-DETR,and enhance it for dense small objects,leading to an accurate and efficient model.Firstly,we introduce a hybrid transformer encoder,which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently.Furthermore,we propose two novel strategies to enhance detection performance without incurring additional inference computation.Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression.Adversarial denoising learning is a novel enhancement method inspired by adversarial learning,which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise.Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach,achieving a significant improvement in accuracy with a reduction in computational complexity.Our method achieves 31.9%and 21.1%AP on the VisDrone and UAVDT datasets,respectively,and has a faster inference speed,making it a competitive model in UAV image object detection.展开更多
Advances in machine vision systems have revolutionized applications such as autonomous driving,robotic navigation,and augmented reality.Despite substantial progress,challenges persist,including dynamic backgrounds,occ...Advances in machine vision systems have revolutionized applications such as autonomous driving,robotic navigation,and augmented reality.Despite substantial progress,challenges persist,including dynamic backgrounds,occlusion,and limited labeled data.To address these challenges,we introduce a comprehensive methodology toenhance image classification and object detection accuracy.The proposed approach involves the integration ofmultiple methods in a complementary way.The process commences with the application of Gaussian filters tomitigate the impact of noise interference.These images are then processed for segmentation using Fuzzy C-Meanssegmentation in parallel with saliency mapping techniques to find the most prominent regions.The Binary RobustIndependent Elementary Features(BRIEF)characteristics are then extracted fromdata derived fromsaliency mapsand segmented images.For precise object separation,Oriented FAST and Rotated BRIEF(ORB)algorithms areemployed.Genetic Algorithms(GAs)are used to optimize Random Forest classifier parameters which lead toimproved performance.Our method stands out due to its comprehensive approach,adeptly addressing challengessuch as changing backdrops,occlusion,and limited labeled data concurrently.A significant enhancement hasbeen achieved by integrating Genetic Algorithms(GAs)to precisely optimize parameters.This minor adjustmentnot only boosts the uniqueness of our system but also amplifies its overall efficacy.The proposed methodologyhas demonstrated notable classification accuracies of 90.9%and 89.0%on the challenging Corel-1k and MSRCdatasets,respectively.Furthermore,detection accuracies of 87.2%and 86.6%have been attained.Although ourmethod performed well in both datasets it may face difficulties in real-world data especially where datasets havehighly complex backgrounds.Despite these limitations,GAintegration for parameter optimization shows a notablestrength in enhancing the overall adaptability and performance of our system.展开更多
Confusing object detection(COD),such as glass,mirrors,and camouflaged objects,represents a burgeoning visual detection task centered on pinpointing and distinguishing concealed targets within intricate backgrounds,lev...Confusing object detection(COD),such as glass,mirrors,and camouflaged objects,represents a burgeoning visual detection task centered on pinpointing and distinguishing concealed targets within intricate backgrounds,leveraging deep learning methodologies.Despite garnering increasing attention in computer vision,the focus of most existing works leans toward formulating task-specific solutions rather than delving into in-depth analyses of methodological structures.As of now,there is a notable absence of a comprehensive systematic review that focuses on recently proposed deep learning-based models for these specific tasks.To fill this gap,our study presents a pioneering review that covers both themodels and the publicly available benchmark datasets,while also identifying potential directions for future research in this field.The current dataset primarily focuses on single confusing object detection at the image level,with some studies extending to video-level data.We conduct an in-depth analysis of deep learning architectures,revealing that the current state-of-the-art(SOTA)COD methods demonstrate promising performance in single object detection.We also compile and provide detailed descriptions ofwidely used datasets relevant to these detection tasks.Our endeavor extends to discussing the limitations observed in current methodologies,alongside proposed solutions aimed at enhancing detection accuracy.Additionally,we deliberate on relevant applications and outline future research trajectories,aiming to catalyze advancements in the field of glass,mirror,and camouflaged object detection.展开更多
Discovering floating wastes,especially bottles on water,is a crucial research problem in environmental hygiene.Nevertheless,real-world applications often face challenges such as interference from irrelevant objects an...Discovering floating wastes,especially bottles on water,is a crucial research problem in environmental hygiene.Nevertheless,real-world applications often face challenges such as interference from irrelevant objects and the high cost associated with data collection.Consequently,devising algorithms capable of accurately localizing specific objects within a scene in scenarios where annotated data is limited remains a formidable challenge.To solve this problem,this paper proposes an object discovery by request problem setting and a corresponding algorithmic framework.The proposed problem setting aims to identify specified objects in scenes,and the associated algorithmic framework comprises pseudo data generation and object discovery by request network.Pseudo-data generation generates images resembling natural scenes through various data augmentation rules,using a small number of object samples and scene images.The network structure of object discovery by request utilizes the pre-trained Vision Transformer(ViT)model as the backbone,employs object-centric methods to learn the latent representations of foreground objects,and applies patch-level reconstruction constraints to the model.During the validation phase,we use the generated pseudo datasets as training sets and evaluate the performance of our model on the original test sets.Experiments have proved that our method achieves state-of-the-art performance on Unmanned Aerial Vehicles-Bottle Detection(UAV-BD)dataset and self-constructed dataset Bottle,especially in multi-object scenarios.展开更多
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection.However,small objects are difficult to detect accurately because they contain les...Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection.However,small objects are difficult to detect accurately because they contain less information.Many current methods,particularly those based on Feature Pyramid Network(FPN),address this challenge by leveraging multi-scale feature fusion.However,existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers,leading to suboptimal small object detection.To address this problem,we propose the Two-layerAttention Feature Pyramid Network(TA-FPN),featuring two key modules:the Two-layer Attention Module(TAM)and the Small Object Detail Enhancement Module(SODEM).TAM uses the attention module to make the network more focused on the semantic information of the object and fuse it to the lower layer,so that each layer contains similar semantic information,to alleviate the problem of small object information being submerged due to semantic gaps between different layers.At the same time,SODEM is introduced to strengthen the local features of the object,suppress background noise,enhance the information details of the small object,and fuse the enhanced features to other feature layers to ensure that each layer is rich in small object information,to improve small object detection accuracy.Our extensive experiments on challenging datasets such as Microsoft Common Objects inContext(MSCOCO)and Pattern Analysis Statistical Modelling and Computational Learning,Visual Object Classes(PASCAL VOC)demonstrate the validity of the proposedmethod.Experimental results show a significant improvement in small object detection accuracy compared to state-of-theart detectors.展开更多
Aiming at the limitations of the existing railway foreign object detection methods based on two-dimensional(2D)images,such as short detection distance,strong influence of environment and lack of distance information,w...Aiming at the limitations of the existing railway foreign object detection methods based on two-dimensional(2D)images,such as short detection distance,strong influence of environment and lack of distance information,we propose Rail-PillarNet,a three-dimensional(3D)LIDAR(Light Detection and Ranging)railway foreign object detection method based on the improvement of PointPillars.Firstly,the parallel attention pillar encoder(PAPE)is designed to fully extract the features of the pillars and alleviate the problem of local fine-grained information loss in PointPillars pillars encoder.Secondly,a fine backbone network is designed to improve the feature extraction capability of the network by combining the coding characteristics of LIDAR point cloud feature and residual structure.Finally,the initial weight parameters of the model were optimised by the transfer learning training method to further improve accuracy.The experimental results on the OSDaR23 dataset show that the average accuracy of Rail-PillarNet reaches 58.51%,which is higher than most mainstream models,and the number of parameters is 5.49 M.Compared with PointPillars,the accuracy of each target is improved by 10.94%,3.53%,16.96%and 19.90%,respectively,and the number of parameters only increases by 0.64M,which achieves a balance between the number of parameters and accuracy.展开更多
The data analysis of blasting sites has always been the research goal of relevant researchers.The rise of mobile blasting robots has aroused many researchers’interest in machine learning methods for target detection ...The data analysis of blasting sites has always been the research goal of relevant researchers.The rise of mobile blasting robots has aroused many researchers’interest in machine learning methods for target detection in the field of blasting.Serverless Computing can provide a variety of computing services for people without hardware foundations and rich software development experience,which has aroused people’s interest in how to use it in the field ofmachine learning.In this paper,we design a distributedmachine learning training application based on the AWS Lambda platform.Based on data parallelism,the data aggregation and training synchronization in Function as a Service(FaaS)are effectively realized.It also encrypts the data set,effectively reducing the risk of data leakage.We rent a cloud server and a Lambda,and then we conduct experiments to evaluate our applications.Our results indicate the effectiveness,rapidity,and economy of distributed training on FaaS.展开更多
Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have becom...Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information.However,current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information.In this paper,we introduce self-calibration multi-head self-attention Transformer(SMSTracker)as a solution to these challenges.It employs a hybrid tensor decomposition self-organizing multihead self-attention transformermechanism,which not only compresses and accelerates Transformer operations but also significantly reduces redundant data,thereby enhancing the accuracy and efficiency of tracking.Additionally,we introduce a self-calibration attention fusion block to resolve common issues of attention ambiguities and inconsistencies found in traditional trackingmethods,ensuring the stability and reliability of tracking performance across various scenarios.By integrating a hybrid tensor decomposition approach with a self-organizingmulti-head self-attentive transformer mechanism,SMSTracker enhances the efficiency and accuracy of the tracking process.Experimental results show that SMSTracker achieves competitive performance in visual object tracking,promising more robust and efficient tracking systems,demonstrating its potential to providemore robust and efficient tracking solutions in real-world applications.展开更多
In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,...In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,diagnosis and evaluation of kidney and urinary tract disease,providing insight into the specific type and severity.However,manual urine sediment examination is labor-intensive,time-consuming,and subjective.Traditional machine learning based object detection methods require hand-crafted features for localization and classification,which have poor generalization capabilities and are difficult to quickly and accurately detect the number of urine sediments.Deep learning based object detection methods have the potential to address the challenges mentioned above,but these methods require access to large urine sediment image datasets.Unfortunately,only a limited number of publicly available urine sediment datasets are currently available.To alleviate the lack of urine sediment datasets in medical image analysis,we propose a new dataset named UriSed2K,which contains 2465 high-quality images annotated with expert guidance.Two main challenges are associated with our dataset:a large number of small objects and the occlusion between these small objects.Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset.Specifically,our goal is to improve the accuracy and efficiency of the detection algorithm and,in doing so,provide medical professionals with an automatic detector that saves time and effort.We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO.The proposed algorithm comprises a local context attention module and a global background suppression module,which aid the detector in distinguishing urine sediment features in the image.The local context attention module captures context information beyond the object region,while the global background suppression module emphasizes objects in uninformative backgrounds.We comprehensively evaluate our method on the UriSed2K dataset,which includes seven categories of urine sediments,such as erythrocytes(red blood cells),leukocytes(white blood cells),epithelial cells,crystals,mycetes,broken erythrocytes,and broken leukocytes,achieving the best average precision(AP)of 95.3%while taking only 10 ms per image.The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.展开更多
Object detection finds wide application in various sectors,including autonomous driving,industry,and healthcare.Recent studies have highlighted the vulnerability of object detection models built using deep neural netw...Object detection finds wide application in various sectors,including autonomous driving,industry,and healthcare.Recent studies have highlighted the vulnerability of object detection models built using deep neural networks when confronted with carefully crafted adversarial examples.This not only reveals their shortcomings in defending against malicious attacks but also raises widespread concerns about the security of existing systems.Most existing adversarial attack strategies focus primarily on image classification problems,failing to fully exploit the unique characteristics of object detectionmodels,thus resulting in widespread deficiencies in their transferability.Furthermore,previous research has predominantly concentrated on the transferability issues of non-targeted attacks,whereas enhancing the transferability of targeted adversarial examples presents even greater challenges.Traditional attack techniques typically employ cross-entropy as a loss measure,iteratively adjusting adversarial examples to match target categories.However,their inherent limitations restrict their broad applicability and transferability across different models.To address the aforementioned challenges,this study proposes a novel targeted adversarial attack method aimed at enhancing the transferability of adversarial samples across object detection models.Within the framework of iterative attacks,we devise a new objective function designed to mitigate consistency issues arising from cumulative noise and to enhance the separation between target and non-target categories(logit margin).Secondly,a data augmentation framework incorporating random erasing and color transformations is introduced into targeted adversarial attacks.This enhances the diversity of gradients,preventing overfitting to white-box models.Lastly,perturbations are applied only within the specified object’s bounding box to reduce the perturbation range,enhancing attack stealthiness.Experiments were conducted on the Microsoft Common Objects in Context(MS COCO)dataset using You Only Look Once version 3(YOLOv3),You Only Look Once version 8(YOLOv8),Faster Region-based Convolutional Neural Networks(Faster R-CNN),and RetinaNet.The results demonstrate a significant advantage of the proposed method in black-box settings.Among these,the success rate of RetinaNet transfer attacks reached a maximum of 82.59%.展开更多
Recently,weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling.However,there is a large performance gap between weakly supervised and fully superv...Recently,weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling.However,there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background information.Therefore,an intuitive idea is to infer annotations that cover more complete object and background regions for training.To this end,a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent labels.Specifically,k-means clustering algorithm was first performed on both colours and coordinates of original annotations,and then assigned the same labels to points having similar colours with colour cluster centres and near coordinate cluster centres.Next,the same annotations for pixels with similar colours within each kernel neighbourhood was set further.Extensive experiments on six benchmarks demonstrate that our method can significantly improve the performance and achieve the state-of-the-art results.展开更多
The advancement of navigation systems for the visually impaired has significantly enhanced their mobility by mitigating the risk of encountering obstacles and guiding them along safe,navigable routes.Traditional appro...The advancement of navigation systems for the visually impaired has significantly enhanced their mobility by mitigating the risk of encountering obstacles and guiding them along safe,navigable routes.Traditional approaches primarily focus on broad applications such as wayfinding,obstacle detection,and fall prevention.However,there is a notable discrepancy in applying these technologies to more specific scenarios,like identifying distinct food crop types or recognizing faces.This study proposes a real-time application designed for visually impaired individuals,aiming to bridge this research-application gap.It introduces a system capable of detecting 20 different food crop types and recognizing faces with impressive accuracies of 83.27%and 95.64%,respectively.These results represent a significant contribution to the field of assistive technologies,providing visually impaired users with detailed and relevant information about their surroundings,thereby enhancing their mobility and ensuring their safety.Additionally,it addresses the vital aspects of social engagements,acknowledging the challenges faced by visually impaired individuals in recognizing acquaintances without auditory or tactile signals,and highlights recent developments in prototype systems aimed at assisting with face recognition tasks.This comprehensive approach not only promises enhanced navigational aids but also aims to enrich the social well-being and safety of visually impaired communities.展开更多
Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false...Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery.Additionally,these complexities contribute to inaccuracies in target localization and hinder precise target categorization.This paper addresses these challenges by proposing a solution:The YOLO-MFD model(YOLO-MFD:Remote Sensing Image Object Detection withMulti-scale Fusion Dynamic Head).Before presenting our method,we delve into the prevalent issues faced in remote sensing imagery analysis.Specifically,we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds.To resolve these issues,we introduce a novel approach.First,we propose the implementation of a lightweight multi-scale module called CEF.This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information.It effectively addresses the issues of missed detection and mistaken alarms that are common in remote sensing imagery.Second,an additional layer of small target detection heads is added,and a residual link is established with the higher-level feature extraction module in the backbone section.This allows the model to incorporate shallower information,significantly improving the accuracy of target localization in remotely sensed images.Finally,a dynamic head attentionmechanism is introduced.This allows themodel to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes.Consequently,the precision of object detection is significantly improved.The trial results show that the YOLO-MFD model shows improvements of 6.3%,3.5%,and 2.5%over the original YOLOv8 model in Precision,map@0.5 and map@0.5:0.95,separately.These results illustrate the clear advantages of the method.展开更多
Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in thefield of small object detection on unmanned aerial vehicles(UAVs).This task is challenging due to variati...Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in thefield of small object detection on unmanned aerial vehicles(UAVs).This task is challenging due to variations inUAV flight altitude,differences in object scales,as well as factors like flight speed and motion blur.To enhancethe detection efficacy of small targets in drone aerial imagery,we propose an enhanced You Only Look Onceversion 7(YOLOv7)algorithm based on multi-scale spatial context.We build the MSC-YOLO model,whichincorporates an additional prediction head,denoted as P2,to improve adaptability for small objects.We replaceconventional downsampling with a Spatial-to-Depth Convolutional Combination(CSPDC)module to mitigatethe loss of intricate feature details related to small objects.Furthermore,we propose a Spatial Context Pyramidwith Multi-Scale Attention(SCPMA)module,which captures spatial and channel-dependent features of smalltargets acrossmultiple scales.This module enhances the perception of spatial contextual features and the utilizationof multiscale feature information.On the Visdrone2023 and UAVDT datasets,MSC-YOLO achieves remarkableresults,outperforming the baseline method YOLOv7 by 3.0%in terms ofmean average precision(mAP).The MSCYOLOalgorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets inUAV aerial photography,providing strong support for practical applications.展开更多
In recent years,there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks.Despite these efforts,the detection of small objects in re...In recent years,there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks.Despite these efforts,the detection of small objects in remote sensing remains a formidable challenge.The deep network structure will bring about the loss of object features,resulting in the loss of object features and the near elimination of some subtle features associated with small objects in deep layers.Additionally,the features of small objects are susceptible to interference from background features contained within the image,leading to a decline in detection accuracy.Moreover,the sensitivity of small objects to the bounding box perturbation further increases the detection difficulty.In this paper,we introduce a novel approach,Cross-Layer Fusion and Weighted Receptive Field-based YOLO(CAW-YOLO),specifically designed for small object detection in remote sensing.To address feature loss in deep layers,we have devised a cross-layer attention fusion module.Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention(BRA).To enhance the model’s capacity to perceive multi-scale objects,particularly small-scale objects,we introduce a weightedmulti-receptive field atrous spatial pyramid poolingmodule.Furthermore,wemitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance(NWD)and Efficient Intersection over Union(EIoU)losses.The efficacy of the proposedmodel in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets.The experimental results unequivocally demonstrate the model’s pronounced advantages in small object detection for remote sensing,surpassing the performance of current mainstream models.展开更多
基金supported by VTT Technical Research Centre of Finland,Aalto University,Aerosint SA,and partially from European Union Horizon 2020 (No.768775)。
文摘Multi-material laser-based powder bed fusion (PBF-LB) allows manufacturing of parts with 3-dimensional gradient and additional functionality in a single step. This research focuses on the combination of thermally-conductive CuCr1Zr with hard M300 tool steel.Two interface configurations of M300 on CuCr1Zr and CuCr1Zr on M300 were investigated. Ultra-fine grains form at the interface due to the low mutual solubility of Cu and steel. The material mixing zone size is dependent on the configurations and tunable in the range of0.1–0.3 mm by introducing a separate set of parameters for the interface layers. Microcracks and pores mainly occur in the transition zone.Regardless of these defects, the thermal diffusivity of bimetallic parts with 50vol% of CuCr1Zr significantly increases by 70%–150%compared to pure M300. The thermal diffusivity of CuCr1Zr and the hardness of M300 steel can be enhanced simultaneously by applying the aging heat treatment.
基金supported by the National Natural Science Foundation of China(Grant 52175236).
文摘This paper proposes a multi-material topology optimization method based on the hybrid reliability of the probability-ellipsoid model with stress constraint for the stochastic uncertainty and epistemic uncertainty of mechanical loads in optimization design.The probabilistic model is combined with the ellipsoidal model to describe the uncertainty of mechanical loads.The topology optimization formula is combined with the ordered solid isotropic material with penalization(ordered-SIMP)multi-material interpolation model.The stresses of all elements are integrated into a global stress measurement that approximates the maximum stress using the normalized p-norm function.Furthermore,the sequential optimization and reliability assessment(SORA)is applied to transform the original uncertainty optimization problem into an equivalent deterministic topology optimization(DTO)problem.Stochastic response surface and sparse grid technique are combined with SORA to get accurate information on the most probable failure point(MPP).In each cycle,the equivalent topology optimization formula is updated according to the MPP information obtained in the previous cycle.The adjoint variable method is used for deriving the sensitivity of the stress constraint and the moving asymptote method(MMA)is used to update design variables.Finally,the validity and feasibility of the method are verified by the numerical example of L-shape beam design,T-shape structure design,steering knuckle,and 3D T-shaped beam.
基金supported in part by National Natural Science Foundation of China under Grant Nos.51675525,52005505,and 62001502Post-Graduate Scientific Research Innovation Project of Hunan Province under Grant No.XJCX2023185.
文摘In recent years,there has been significant research on the application of deep learning(DL)in topology optimization(TO)to accelerate structural design.However,these methods have primarily focused on solving binary TO problems,and effective solutions for multi-material topology optimization(MMTO)which requires a lot of computing resources are still lacking.Therefore,this paper proposes the framework of multiphase topology optimization using deep learning to accelerate MMTO design.The framework employs convolutional neural network(CNN)to construct a surrogate model for solving MMTO,and the obtained surrogate model can rapidly generate multi-material structure topologies in negligible time without any iterations.The performance evaluation results show that the proposed method not only outputs multi-material topologies with clear material boundary but also reduces the calculation cost with high prediction accuracy.Additionally,in order to find a more reasonable modeling method for MMTO,this paper studies the characteristics of surrogate modeling as regression task and classification task.Through the training of 297 models,our findings show that the regression task yields slightly better results than the classification task in most cases.Furthermore,The results indicate that the prediction accuracy is primarily influenced by factors such as the TO problem,material category,and data scale.Conversely,factors such as the domain size and the material property have minimal impact on the accuracy.
基金the support by National Key Research and Development Program of China(2018YFA0703000)National Natural Science Foundation of China(Grant No.52105310)+1 种基金Natural Science Foundation of Zhejiang Province(Grant No.LDQ23E050001)the Starry Night Science Fund of Zhejiang University Shanghai Institute for Advanced Study(Grant No.SN-ZJU-SIAS-004)。
文摘In order to mimic the natural heterogeneity of native tissue and provide a better microenvironment for cell culturing,multi-material bioprinting has become a common solution to construct tissue models in vitro.With the embedded printing method,complex 3D structure can be printed using soft biomaterials with reasonable shape fidelity.However,the current sequential multi-material embedded printing method faces a major challenge,which is the inevitable trade-off between the printed structural integrity and printing precision.Here,we propose a simultaneous multi-material embedded printing method.With this method,we can easily print firmly attached and high-precision multilayer structures.With multiple individually controlled nozzles,different biomaterials can be precisely deposited into a single crevasse,minimizing uncontrolled squeezing and guarantees no contamination of embedding medium within the structure.We analyse the dynamics of the extruded bioink in the embedding medium both analytically and experimentally,and quantitatively evaluate the effects of printing parameters including printing speed and rheology of embedding medium,on the 3D morphology of the printed filament.We demonstrate the printing of double-layer thin-walled structures,each layer less than 200μm,as well as intestine and liver models with 5%gelatin methacryloyl that are crosslinked and extracted from the embedding medium without significant impairment or delamination.The peeling test further proves that the proposed method offers better structural integrity than conventional sequential printing methods.The proposed simultaneous multi-material embedded printing method can serve as a powerful tool to support the complex heterogeneous structure fabrication and open unique prospects for personalized medicine.
基金The authors acknowledge the funding support from the National Natural Science Foundation of China(Nos.52175474 and 51775324)the China Scholarship Council(No.202006890054).
文摘Three-dimensional printing technologies exhibit tremendous potential in the advancing fields of tissue engineering and regenerative medicine due to the precise spatial control over depositing the biomaterial.Despite their widespread utilization and numerous advantages,the development of suitable novel biomaterials for extrusion-based 3D printing of scaffolds that support cell attachment,proliferation,and vascularization remains a challenge.Multi-material composite hydrogels present incredible potential in this field.Thus,in this work,a multi-material composite hydrogel with a promising formulation of chitosan/gelatin functionalized with egg white was developed,which provides good printability and shape fidelity.In addition,a series of comparative analyses of different crosslinking agents and processes based on tripolyphosphate(TPP),genipin(GP),and glutaraldehyde(GTA)were investigated and compared to select the ideal crosslinking strategy to enhance the physicochemical and biological properties of the fabricated scaffolds.All of the results indicate that the composite hydrogel and the resulting scaffolds utilizing TPP crosslinking have great potential in tissue engineering,especially for supporting neo-vessel growth into the scaffold and promoting angiogenesis within engineered tissues.
基金This research was funded by the Natural Science Foundation of Hebei Province(F2021506004).
文摘Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle(UAV)imagery.Addressing these limitations,we propose a hybrid transformer-based detector,H-DETR,and enhance it for dense small objects,leading to an accurate and efficient model.Firstly,we introduce a hybrid transformer encoder,which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently.Furthermore,we propose two novel strategies to enhance detection performance without incurring additional inference computation.Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression.Adversarial denoising learning is a novel enhancement method inspired by adversarial learning,which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise.Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach,achieving a significant improvement in accuracy with a reduction in computational complexity.Our method achieves 31.9%and 21.1%AP on the VisDrone and UAVDT datasets,respectively,and has a faster inference speed,making it a competitive model in UAV image object detection.
基金a grant from the Basic Science Research Program through the National Research Foundation(NRF)(2021R1F1A1063634)funded by the Ministry of Science and ICT(MSIT)Republic of Korea.This research is supported and funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2024R410)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding program Grant Code(NU/RG/SERC/12/6).
文摘Advances in machine vision systems have revolutionized applications such as autonomous driving,robotic navigation,and augmented reality.Despite substantial progress,challenges persist,including dynamic backgrounds,occlusion,and limited labeled data.To address these challenges,we introduce a comprehensive methodology toenhance image classification and object detection accuracy.The proposed approach involves the integration ofmultiple methods in a complementary way.The process commences with the application of Gaussian filters tomitigate the impact of noise interference.These images are then processed for segmentation using Fuzzy C-Meanssegmentation in parallel with saliency mapping techniques to find the most prominent regions.The Binary RobustIndependent Elementary Features(BRIEF)characteristics are then extracted fromdata derived fromsaliency mapsand segmented images.For precise object separation,Oriented FAST and Rotated BRIEF(ORB)algorithms areemployed.Genetic Algorithms(GAs)are used to optimize Random Forest classifier parameters which lead toimproved performance.Our method stands out due to its comprehensive approach,adeptly addressing challengessuch as changing backdrops,occlusion,and limited labeled data concurrently.A significant enhancement hasbeen achieved by integrating Genetic Algorithms(GAs)to precisely optimize parameters.This minor adjustmentnot only boosts the uniqueness of our system but also amplifies its overall efficacy.The proposed methodologyhas demonstrated notable classification accuracies of 90.9%and 89.0%on the challenging Corel-1k and MSRCdatasets,respectively.Furthermore,detection accuracies of 87.2%and 86.6%have been attained.Although ourmethod performed well in both datasets it may face difficulties in real-world data especially where datasets havehighly complex backgrounds.Despite these limitations,GAintegration for parameter optimization shows a notablestrength in enhancing the overall adaptability and performance of our system.
基金supported by the NationalNatural Science Foundation of China Nos.62302167,U23A20343Shanghai Sailing Program(23YF1410500)Chenguang Program of Shanghai Education Development Foundation and Shanghai Municipal Education Commission(23CGA34).
文摘Confusing object detection(COD),such as glass,mirrors,and camouflaged objects,represents a burgeoning visual detection task centered on pinpointing and distinguishing concealed targets within intricate backgrounds,leveraging deep learning methodologies.Despite garnering increasing attention in computer vision,the focus of most existing works leans toward formulating task-specific solutions rather than delving into in-depth analyses of methodological structures.As of now,there is a notable absence of a comprehensive systematic review that focuses on recently proposed deep learning-based models for these specific tasks.To fill this gap,our study presents a pioneering review that covers both themodels and the publicly available benchmark datasets,while also identifying potential directions for future research in this field.The current dataset primarily focuses on single confusing object detection at the image level,with some studies extending to video-level data.We conduct an in-depth analysis of deep learning architectures,revealing that the current state-of-the-art(SOTA)COD methods demonstrate promising performance in single object detection.We also compile and provide detailed descriptions ofwidely used datasets relevant to these detection tasks.Our endeavor extends to discussing the limitations observed in current methodologies,alongside proposed solutions aimed at enhancing detection accuracy.Additionally,we deliberate on relevant applications and outline future research trajectories,aiming to catalyze advancements in the field of glass,mirror,and camouflaged object detection.
文摘Discovering floating wastes,especially bottles on water,is a crucial research problem in environmental hygiene.Nevertheless,real-world applications often face challenges such as interference from irrelevant objects and the high cost associated with data collection.Consequently,devising algorithms capable of accurately localizing specific objects within a scene in scenarios where annotated data is limited remains a formidable challenge.To solve this problem,this paper proposes an object discovery by request problem setting and a corresponding algorithmic framework.The proposed problem setting aims to identify specified objects in scenes,and the associated algorithmic framework comprises pseudo data generation and object discovery by request network.Pseudo-data generation generates images resembling natural scenes through various data augmentation rules,using a small number of object samples and scene images.The network structure of object discovery by request utilizes the pre-trained Vision Transformer(ViT)model as the backbone,employs object-centric methods to learn the latent representations of foreground objects,and applies patch-level reconstruction constraints to the model.During the validation phase,we use the generated pseudo datasets as training sets and evaluate the performance of our model on the original test sets.Experiments have proved that our method achieves state-of-the-art performance on Unmanned Aerial Vehicles-Bottle Detection(UAV-BD)dataset and self-constructed dataset Bottle,especially in multi-object scenarios.
文摘Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection.However,small objects are difficult to detect accurately because they contain less information.Many current methods,particularly those based on Feature Pyramid Network(FPN),address this challenge by leveraging multi-scale feature fusion.However,existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers,leading to suboptimal small object detection.To address this problem,we propose the Two-layerAttention Feature Pyramid Network(TA-FPN),featuring two key modules:the Two-layer Attention Module(TAM)and the Small Object Detail Enhancement Module(SODEM).TAM uses the attention module to make the network more focused on the semantic information of the object and fuse it to the lower layer,so that each layer contains similar semantic information,to alleviate the problem of small object information being submerged due to semantic gaps between different layers.At the same time,SODEM is introduced to strengthen the local features of the object,suppress background noise,enhance the information details of the small object,and fuse the enhanced features to other feature layers to ensure that each layer is rich in small object information,to improve small object detection accuracy.Our extensive experiments on challenging datasets such as Microsoft Common Objects inContext(MSCOCO)and Pattern Analysis Statistical Modelling and Computational Learning,Visual Object Classes(PASCAL VOC)demonstrate the validity of the proposedmethod.Experimental results show a significant improvement in small object detection accuracy compared to state-of-theart detectors.
基金supported by a grant from the National Key Research and Development Project(2023YFB4302100)Key Research and Development Project of Jiangxi Province(No.20232ACE01011)Independent Deployment Project of Ganjiang Innovation Research Institute,Chinese Academy of Sciences(E255J001).
文摘Aiming at the limitations of the existing railway foreign object detection methods based on two-dimensional(2D)images,such as short detection distance,strong influence of environment and lack of distance information,we propose Rail-PillarNet,a three-dimensional(3D)LIDAR(Light Detection and Ranging)railway foreign object detection method based on the improvement of PointPillars.Firstly,the parallel attention pillar encoder(PAPE)is designed to fully extract the features of the pillars and alleviate the problem of local fine-grained information loss in PointPillars pillars encoder.Secondly,a fine backbone network is designed to improve the feature extraction capability of the network by combining the coding characteristics of LIDAR point cloud feature and residual structure.Finally,the initial weight parameters of the model were optimised by the transfer learning training method to further improve accuracy.The experimental results on the OSDaR23 dataset show that the average accuracy of Rail-PillarNet reaches 58.51%,which is higher than most mainstream models,and the number of parameters is 5.49 M.Compared with PointPillars,the accuracy of each target is improved by 10.94%,3.53%,16.96%and 19.90%,respectively,and the number of parameters only increases by 0.64M,which achieves a balance between the number of parameters and accuracy.
文摘The data analysis of blasting sites has always been the research goal of relevant researchers.The rise of mobile blasting robots has aroused many researchers’interest in machine learning methods for target detection in the field of blasting.Serverless Computing can provide a variety of computing services for people without hardware foundations and rich software development experience,which has aroused people’s interest in how to use it in the field ofmachine learning.In this paper,we design a distributedmachine learning training application based on the AWS Lambda platform.Based on data parallelism,the data aggregation and training synchronization in Function as a Service(FaaS)are effectively realized.It also encrypts the data set,effectively reducing the risk of data leakage.We rent a cloud server and a Lambda,and then we conduct experiments to evaluate our applications.Our results indicate the effectiveness,rapidity,and economy of distributed training on FaaS.
基金supported by the National Natural Science Foundation of China under Grant 62177029the Postgraduate Research&Practice Innovation Program of Jiangsu Province(KYCX21_0740),China.
文摘Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information.However,current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information.In this paper,we introduce self-calibration multi-head self-attention Transformer(SMSTracker)as a solution to these challenges.It employs a hybrid tensor decomposition self-organizing multihead self-attention transformermechanism,which not only compresses and accelerates Transformer operations but also significantly reduces redundant data,thereby enhancing the accuracy and efficiency of tracking.Additionally,we introduce a self-calibration attention fusion block to resolve common issues of attention ambiguities and inconsistencies found in traditional trackingmethods,ensuring the stability and reliability of tracking performance across various scenarios.By integrating a hybrid tensor decomposition approach with a self-organizingmulti-head self-attentive transformer mechanism,SMSTracker enhances the efficiency and accuracy of the tracking process.Experimental results show that SMSTracker achieves competitive performance in visual object tracking,promising more robust and efficient tracking systems,demonstrating its potential to providemore robust and efficient tracking solutions in real-world applications.
基金This work was partially supported by the National Natural Science Foundation of China(Grant Nos.61906168,U20A20171)Zhejiang Provincial Natural Science Foundation of China(Grant Nos.LY23F020023,LY21F020027)Construction of Hubei Provincial Key Laboratory for Intelligent Visual Monitoring of Hydropower Projects(Grant Nos.2022SDSJ01).
文摘In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,diagnosis and evaluation of kidney and urinary tract disease,providing insight into the specific type and severity.However,manual urine sediment examination is labor-intensive,time-consuming,and subjective.Traditional machine learning based object detection methods require hand-crafted features for localization and classification,which have poor generalization capabilities and are difficult to quickly and accurately detect the number of urine sediments.Deep learning based object detection methods have the potential to address the challenges mentioned above,but these methods require access to large urine sediment image datasets.Unfortunately,only a limited number of publicly available urine sediment datasets are currently available.To alleviate the lack of urine sediment datasets in medical image analysis,we propose a new dataset named UriSed2K,which contains 2465 high-quality images annotated with expert guidance.Two main challenges are associated with our dataset:a large number of small objects and the occlusion between these small objects.Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset.Specifically,our goal is to improve the accuracy and efficiency of the detection algorithm and,in doing so,provide medical professionals with an automatic detector that saves time and effort.We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO.The proposed algorithm comprises a local context attention module and a global background suppression module,which aid the detector in distinguishing urine sediment features in the image.The local context attention module captures context information beyond the object region,while the global background suppression module emphasizes objects in uninformative backgrounds.We comprehensively evaluate our method on the UriSed2K dataset,which includes seven categories of urine sediments,such as erythrocytes(red blood cells),leukocytes(white blood cells),epithelial cells,crystals,mycetes,broken erythrocytes,and broken leukocytes,achieving the best average precision(AP)of 95.3%while taking only 10 ms per image.The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.
文摘Object detection finds wide application in various sectors,including autonomous driving,industry,and healthcare.Recent studies have highlighted the vulnerability of object detection models built using deep neural networks when confronted with carefully crafted adversarial examples.This not only reveals their shortcomings in defending against malicious attacks but also raises widespread concerns about the security of existing systems.Most existing adversarial attack strategies focus primarily on image classification problems,failing to fully exploit the unique characteristics of object detectionmodels,thus resulting in widespread deficiencies in their transferability.Furthermore,previous research has predominantly concentrated on the transferability issues of non-targeted attacks,whereas enhancing the transferability of targeted adversarial examples presents even greater challenges.Traditional attack techniques typically employ cross-entropy as a loss measure,iteratively adjusting adversarial examples to match target categories.However,their inherent limitations restrict their broad applicability and transferability across different models.To address the aforementioned challenges,this study proposes a novel targeted adversarial attack method aimed at enhancing the transferability of adversarial samples across object detection models.Within the framework of iterative attacks,we devise a new objective function designed to mitigate consistency issues arising from cumulative noise and to enhance the separation between target and non-target categories(logit margin).Secondly,a data augmentation framework incorporating random erasing and color transformations is introduced into targeted adversarial attacks.This enhances the diversity of gradients,preventing overfitting to white-box models.Lastly,perturbations are applied only within the specified object’s bounding box to reduce the perturbation range,enhancing attack stealthiness.Experiments were conducted on the Microsoft Common Objects in Context(MS COCO)dataset using You Only Look Once version 3(YOLOv3),You Only Look Once version 8(YOLOv8),Faster Region-based Convolutional Neural Networks(Faster R-CNN),and RetinaNet.The results demonstrate a significant advantage of the proposed method in black-box settings.Among these,the success rate of RetinaNet transfer attacks reached a maximum of 82.59%.
文摘Recently,weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling.However,there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background information.Therefore,an intuitive idea is to infer annotations that cover more complete object and background regions for training.To this end,a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent labels.Specifically,k-means clustering algorithm was first performed on both colours and coordinates of original annotations,and then assigned the same labels to points having similar colours with colour cluster centres and near coordinate cluster centres.Next,the same annotations for pixels with similar colours within each kernel neighbourhood was set further.Extensive experiments on six benchmarks demonstrate that our method can significantly improve the performance and achieve the state-of-the-art results.
基金supported by theKorea Industrial Technology Association(KOITA)Grant Funded by the Korean government(MSIT)(No.KOITA-2023-3-003)supported by the MSIT(Ministry of Science and ICT),Korea,under the ITRC(Information Technology Research Center)Support Program(IITP-2024-2020-0-01808)Supervised by the IITP(Institute of Information&Communications Technology Planning&Evaluation)。
文摘The advancement of navigation systems for the visually impaired has significantly enhanced their mobility by mitigating the risk of encountering obstacles and guiding them along safe,navigable routes.Traditional approaches primarily focus on broad applications such as wayfinding,obstacle detection,and fall prevention.However,there is a notable discrepancy in applying these technologies to more specific scenarios,like identifying distinct food crop types or recognizing faces.This study proposes a real-time application designed for visually impaired individuals,aiming to bridge this research-application gap.It introduces a system capable of detecting 20 different food crop types and recognizing faces with impressive accuracies of 83.27%and 95.64%,respectively.These results represent a significant contribution to the field of assistive technologies,providing visually impaired users with detailed and relevant information about their surroundings,thereby enhancing their mobility and ensuring their safety.Additionally,it addresses the vital aspects of social engagements,acknowledging the challenges faced by visually impaired individuals in recognizing acquaintances without auditory or tactile signals,and highlights recent developments in prototype systems aimed at assisting with face recognition tasks.This comprehensive approach not only promises enhanced navigational aids but also aims to enrich the social well-being and safety of visually impaired communities.
基金the Scientific Research Fund of Hunan Provincial Education Department(23A0423).
文摘Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery.Additionally,these complexities contribute to inaccuracies in target localization and hinder precise target categorization.This paper addresses these challenges by proposing a solution:The YOLO-MFD model(YOLO-MFD:Remote Sensing Image Object Detection withMulti-scale Fusion Dynamic Head).Before presenting our method,we delve into the prevalent issues faced in remote sensing imagery analysis.Specifically,we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds.To resolve these issues,we introduce a novel approach.First,we propose the implementation of a lightweight multi-scale module called CEF.This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information.It effectively addresses the issues of missed detection and mistaken alarms that are common in remote sensing imagery.Second,an additional layer of small target detection heads is added,and a residual link is established with the higher-level feature extraction module in the backbone section.This allows the model to incorporate shallower information,significantly improving the accuracy of target localization in remotely sensed images.Finally,a dynamic head attentionmechanism is introduced.This allows themodel to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes.Consequently,the precision of object detection is significantly improved.The trial results show that the YOLO-MFD model shows improvements of 6.3%,3.5%,and 2.5%over the original YOLOv8 model in Precision,map@0.5 and map@0.5:0.95,separately.These results illustrate the clear advantages of the method.
基金the Key Research and Development Program of Hainan Province(Grant Nos.ZDYF2023GXJS163,ZDYF2024GXJS014)National Natural Science Foundation of China(NSFC)(Grant Nos.62162022,62162024)+2 种基金the Major Science and Technology Project of Hainan Province(Grant No.ZDKJ2020012)Hainan Provincial Natural Science Foundation of China(Grant No.620MS021)Youth Foundation Project of Hainan Natural Science Foundation(621QN211).
文摘Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in thefield of small object detection on unmanned aerial vehicles(UAVs).This task is challenging due to variations inUAV flight altitude,differences in object scales,as well as factors like flight speed and motion blur.To enhancethe detection efficacy of small targets in drone aerial imagery,we propose an enhanced You Only Look Onceversion 7(YOLOv7)algorithm based on multi-scale spatial context.We build the MSC-YOLO model,whichincorporates an additional prediction head,denoted as P2,to improve adaptability for small objects.We replaceconventional downsampling with a Spatial-to-Depth Convolutional Combination(CSPDC)module to mitigatethe loss of intricate feature details related to small objects.Furthermore,we propose a Spatial Context Pyramidwith Multi-Scale Attention(SCPMA)module,which captures spatial and channel-dependent features of smalltargets acrossmultiple scales.This module enhances the perception of spatial contextual features and the utilizationof multiscale feature information.On the Visdrone2023 and UAVDT datasets,MSC-YOLO achieves remarkableresults,outperforming the baseline method YOLOv7 by 3.0%in terms ofmean average precision(mAP).The MSCYOLOalgorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets inUAV aerial photography,providing strong support for practical applications.
基金supported in part by the National Natural Science Foundation of China under Grant 62006071part by the Science and Technology Research Project of Henan Province under Grant 232103810086.
文摘In recent years,there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks.Despite these efforts,the detection of small objects in remote sensing remains a formidable challenge.The deep network structure will bring about the loss of object features,resulting in the loss of object features and the near elimination of some subtle features associated with small objects in deep layers.Additionally,the features of small objects are susceptible to interference from background features contained within the image,leading to a decline in detection accuracy.Moreover,the sensitivity of small objects to the bounding box perturbation further increases the detection difficulty.In this paper,we introduce a novel approach,Cross-Layer Fusion and Weighted Receptive Field-based YOLO(CAW-YOLO),specifically designed for small object detection in remote sensing.To address feature loss in deep layers,we have devised a cross-layer attention fusion module.Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention(BRA).To enhance the model’s capacity to perceive multi-scale objects,particularly small-scale objects,we introduce a weightedmulti-receptive field atrous spatial pyramid poolingmodule.Furthermore,wemitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance(NWD)and Efficient Intersection over Union(EIoU)losses.The efficacy of the proposedmodel in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets.The experimental results unequivocally demonstrate the model’s pronounced advantages in small object detection for remote sensing,surpassing the performance of current mainstream models.