Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false...Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery.Additionally,these complexities contribute to inaccuracies in target localization and hinder precise target categorization.This paper addresses these challenges by proposing a solution:The YOLO-MFD model(YOLO-MFD:Remote Sensing Image Object Detection withMulti-scale Fusion Dynamic Head).Before presenting our method,we delve into the prevalent issues faced in remote sensing imagery analysis.Specifically,we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds.To resolve these issues,we introduce a novel approach.First,we propose the implementation of a lightweight multi-scale module called CEF.This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information.It effectively addresses the issues of missed detection and mistaken alarms that are common in remote sensing imagery.Second,an additional layer of small target detection heads is added,and a residual link is established with the higher-level feature extraction module in the backbone section.This allows the model to incorporate shallower information,significantly improving the accuracy of target localization in remotely sensed images.Finally,a dynamic head attentionmechanism is introduced.This allows themodel to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes.Consequently,the precision of object detection is significantly improved.The trial results show that the YOLO-MFD model shows improvements of 6.3%,3.5%,and 2.5%over the original YOLOv8 model in Precision,map@0.5 and map@0.5:0.95,separately.These results illustrate the clear advantages of the method.展开更多
Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in thefield of small object detection on unmanned aerial vehicles(UAVs).This task is challenging due to variati...Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in thefield of small object detection on unmanned aerial vehicles(UAVs).This task is challenging due to variations inUAV flight altitude,differences in object scales,as well as factors like flight speed and motion blur.To enhancethe detection efficacy of small targets in drone aerial imagery,we propose an enhanced You Only Look Onceversion 7(YOLOv7)algorithm based on multi-scale spatial context.We build the MSC-YOLO model,whichincorporates an additional prediction head,denoted as P2,to improve adaptability for small objects.We replaceconventional downsampling with a Spatial-to-Depth Convolutional Combination(CSPDC)module to mitigatethe loss of intricate feature details related to small objects.Furthermore,we propose a Spatial Context Pyramidwith Multi-Scale Attention(SCPMA)module,which captures spatial and channel-dependent features of smalltargets acrossmultiple scales.This module enhances the perception of spatial contextual features and the utilizationof multiscale feature information.On the Visdrone2023 and UAVDT datasets,MSC-YOLO achieves remarkableresults,outperforming the baseline method YOLOv7 by 3.0%in terms ofmean average precision(mAP).The MSCYOLOalgorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets inUAV aerial photography,providing strong support for practical applications.展开更多
The detection of ash content in coal slime flotation tailings using deep learning can be hindered by various factors such as foam,impurities,and changing lighting conditions that disrupt the collection of tailings ima...The detection of ash content in coal slime flotation tailings using deep learning can be hindered by various factors such as foam,impurities,and changing lighting conditions that disrupt the collection of tailings images.To address this challenge,we present a method for ash content detection in coal slime flotation tailings.This method utilizes chromatographic filter paper sampling and a multi-scale residual network,which we refer to as MRCN.Initially,tailings are sampled using chromatographic filter paper to obtain static tailings images,effectively isolating interference factors at the flotation site.Subsequently,the MRCN,consisting of a multi-scale residual network,is employed to extract image features and compute ash content.Within the MRCN structure,tailings images undergo convolution operations through two parallel branches that utilize convolution kernels of different sizes,enabling the extraction of image features at various scales and capturing a more comprehensive representation of the ash content information.Furthermore,a channel attention mechanism is integrated to enhance the performance of the model.The combination of the multi-scale residual structure and the channel attention mechanism within MRCN results in robust capabilities for image feature extraction and ash content detection.Comparative experiments demonstrate that this proposed approach,based on chromatographic filter paper sampling and the multi-scale residual network,exhibits significantly superior performance in the detection of ash content in coal slime flotation tailings.展开更多
Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. Processing speed and detection accuracy of the face detection have been improved continuously. Th...Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. Processing speed and detection accuracy of the face detection have been improved continuously. This paper describes a novel method of fast face detection with multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small difference of pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that processing speed of our method is 3.66 times faster than a conventional method, adopting HOG features combined to an SVM classifier, without accuracy degradation.展开更多
In response to the challenge of low detection accuracy and susceptibility to missed and false detections of small targets in unmanned aerial vehicles(UAVs)aerial images,an improved UAV image target detection algorithm...In response to the challenge of low detection accuracy and susceptibility to missed and false detections of small targets in unmanned aerial vehicles(UAVs)aerial images,an improved UAV image target detection algorithm based on YOLOv8 was proposed in this study.To begin with,the CoordAtt attention mechanism was employed to enhance the feature extraction capability of the backbone network,thereby reducing interference from backgrounds.Additionally,the BiFPN feature fusion network with an added small object detection layer was used to enhance the model's ability to perceive for small objects.Furthermore,a multi-level fusion module was designed and proposed to effectively integrate shallow and deep information.The use of an enhanced MPDIoU loss function further improved detection performance.The experimental results based on the publicly available VisDrone2019 dataset showed that the improved model outperformed the YOLOv8 baseline model,mAP@0.5 improved by 20%,and the improved method improved the detection accuracy of the model for small targets.展开更多
With the remarkable advancements in machine vision research and its ever-expanding applications,scholars have increasingly focused on harnessing various vision methodologies within the industrial realm.Specifically,de...With the remarkable advancements in machine vision research and its ever-expanding applications,scholars have increasingly focused on harnessing various vision methodologies within the industrial realm.Specifically,detecting vehicle floor welding points poses unique challenges,including high operational costs and limited portability in practical settings.To address these challenges,this paper innovatively integrates template matching and the Faster RCNN algorithm,presenting an industrial fusion cascaded solder joint detection algorithm that seamlessly blends template matching with deep learning techniques.This algorithm meticulously weights and fuses the optimized features of both methodologies,enhancing the overall detection capabilities.Furthermore,it introduces an optimized multi-scale and multi-template matching approach,leveraging a diverse array of templates and image pyramid algorithms to bolster the accuracy and resilience of object detection.By integrating deep learning algorithms with this multi-scale and multi-template matching strategy,the cascaded target matching algorithm effectively accurately identifies solder joint types and positions.A comprehensive welding point dataset,labeled by experts specifically for vehicle detection,was constructed based on images from authentic industrial environments to validate the algorithm’s performance.Experiments demonstrate the algorithm’s compelling performance in industrial scenarios,outperforming the single-template matching algorithm by 21.3%,the multi-scale and multitemplate matching algorithm by 3.4%,the Faster RCNN algorithm by 19.7%,and the YOLOv9 algorithm by 17.3%in terms of solder joint detection accuracy.This optimized algorithm exhibits remarkable robustness and portability,ideally suited for detecting solder joints across diverse vehicle workpieces.Notably,this study’s dataset and feature fusion approach can be a valuable resource for other algorithms seeking to enhance their solder joint detection capabilities.This work thus not only presents a novel and effective solution for industrial solder joint detection but lays the groundwork for future advancements in this critical area.展开更多
Detection of floating garbage in inland rivers is crucial for water environmental protection,as it effectively reduces ecological damage and ensures the safety of water resources.To address the inefficiency of traditi...Detection of floating garbage in inland rivers is crucial for water environmental protection,as it effectively reduces ecological damage and ensures the safety of water resources.To address the inefficiency of traditional cleanup methods and the challenges in detecting small targets,an improved YOLOv5 object detection model was proposed in this study.In order to enhance the model’s sensitivity to small targets and mitigate the impact of redundant information on detection performance,a bi-level routing attention mechanism was introduced and embedded into the backbone network.Additionally,a multi-scale detection head was incorporated into the model,allowing for more comprehensive coverage of floating garbage of various sizes through multi-scale feature extraction and detection.The Focal-EIoU loss function was also employed to optimize the model parameters,improving localization accuracy.Experimental results on the publicly available FloW_Img dataset demonstrated that the improved YOLOv5 model outperforms the original YOLOv5 model in terms of precision and recall,achieving a mAP(mean average precision)of 86.12%,with significant improvements and faster convergence.展开更多
Breast cancer is a significant threat to the global population,affecting not only women but also a threat to the entire population.With recent advancements in digital pathology,Eosin and hematoxylin images provide enh...Breast cancer is a significant threat to the global population,affecting not only women but also a threat to the entire population.With recent advancements in digital pathology,Eosin and hematoxylin images provide enhanced clarity in examiningmicroscopic features of breast tissues based on their staining properties.Early cancer detection facilitates the quickening of the therapeutic process,thereby increasing survival rates.The analysis made by medical professionals,especially pathologists,is time-consuming and challenging,and there arises a need for automated breast cancer detection systems.The upcoming artificial intelligence platforms,especially deep learning models,play an important role in image diagnosis and prediction.Initially,the histopathology biopsy images are taken from standard data sources.Further,the gathered images are given as input to the Multi-Scale Dilated Vision Transformer,where the essential features are acquired.Subsequently,the features are subjected to the Bidirectional Long Short-Term Memory(Bi-LSTM)for classifying the breast cancer disorder.The efficacy of the model is evaluated using divergent metrics.When compared with other methods,the proposed work reveals that it offers impressive results for detection.展开更多
Face recognition technology automatically identifies an individual from image or video sources.The detection process can be done by attaining facial characteristics from the image of a subject face.Recent developments...Face recognition technology automatically identifies an individual from image or video sources.The detection process can be done by attaining facial characteristics from the image of a subject face.Recent developments in deep learning(DL)and computer vision(CV)techniques enable the design of automated face recognition and tracking methods.This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking(HHODL-AFDT)method.The proposed HHODL-AFDT model involves a Faster region based convolution neural network(RCNN)-based face detection model and HHO-based hyperparameter opti-mization process.The presented optimal Faster RCNN model precisely rec-ognizes the face and is passed into the face-tracking model using a regression network(REGN).The face tracking using the REGN model uses the fea-tures from neighboring frames and foresees the location of the target face in succeeding frames.The application of the HHO algorithm for optimal hyperparameter selection shows the novelty of the work.The experimental validation of the presented HHODL-AFDT algorithm is conducted using two datasets and the experiment outcomes highlighted the superior performance of the HHODL-AFDT model over current methodologies with maximum accuracy of 90.60%and 88.08%under PICS and VTB datasets,respectively.展开更多
A face-mask object detection model incorporating hybrid dilation convolutional network termed ResNet Hybrid-dilation-convolution Face-mask-detector (RHF) is proposed in this paper. Furthermore, a lightweight face-mask...A face-mask object detection model incorporating hybrid dilation convolutional network termed ResNet Hybrid-dilation-convolution Face-mask-detector (RHF) is proposed in this paper. Furthermore, a lightweight face-mask dataset named Light Masked Face Dataset (LMFD) and a medium-sized face-mask dataset named Masked Face Dataset (MFD) with data augmentation methods applied is also constructed in this paper. The hybrid dilation convolutional network is able to expand the perception of the convolutional kernel without concern about the discontinuity of image information during the convolution process. For the given two datasets being constructed above, the trained models are significantly optimized in terms of detection performance, training time, and other related metrics. By using the MFD dataset of 55,905 images, the RHF model requires roughly 10 hours less training time compared to ResNet50 with better detection results with mAP of 93.45%.展开更多
Aiming at the problem of low accuracy of traditional target detection methods for target detection in endoscopes in substation environments, a CNN-based real-time detection method for masked targets is proposed. The m...Aiming at the problem of low accuracy of traditional target detection methods for target detection in endoscopes in substation environments, a CNN-based real-time detection method for masked targets is proposed. The method adopts the overall design of backbone network, detection network and algorithmic parameter optimisation method, completes the model training on the self-constructed occlusion target dataset, and adopts the multi-scale perception method for target detection. The HNM algorithm is used to screen positive and negative samples during the training process, and the NMS algorithm is used to post-process the prediction results during the detection process to improve the detection efficiency. After experimental validation, the obtained model has the multi-class average predicted value (mAP) of the dataset. It has general advantages over traditional target detection methods. The detection time of a single target on FDDB dataset is 39 ms, which can meet the need of real-time target detection. In addition, the project team has successfully deployed the method into substations and put it into use in many places in Beijing, which is important for achieving the anomaly of occlusion target detection.展开更多
Recent security applications in mobile technologies and computer sys-tems use face recognition for high-end security.Despite numerous security tech-niques,face recognition is considered a high-security control.Develop...Recent security applications in mobile technologies and computer sys-tems use face recognition for high-end security.Despite numerous security tech-niques,face recognition is considered a high-security control.Developers fuse and carry out face identification as an access authority into these applications.Still,face identification authentication is sensitive to attacks with a 2-D photo image or captured video to access the system as an authorized user.In the existing spoofing detection algorithm,there was some loss in the recreation of images.This research proposes an unobtrusive technique to detect face spoofing attacks that apply a single frame of the sequenced set of frames to overcome the above-said problems.This research offers a novel Edge-Net autoencoder to select convoluted and dominant features of the input diffused structure.First,this pro-posed method is tested with the Cross-ethnicity Face Anti-spoofing(CASIA),Fetal alcohol spectrum disorders(FASD)dataset.This database has three models of attacks:distorted photographs in printed form,photographs with removed eyes portion,and video attacks.The images are taken with three different quality cameras:low,average,and high-quality real and spoofed images.An extensive experimental study was performed with CASIA-FASD,3 Diagnostic Machine Aid-Digital(DMAD)dataset that proved higher results when compared to existing algorithms.展开更多
Face mask detection has several applications,including real-time surveillance,biometrics,etc.Identifying face masks is also helpful for crowd control and ensuring people wear them publicly.With monitoring personnel,it...Face mask detection has several applications,including real-time surveillance,biometrics,etc.Identifying face masks is also helpful for crowd control and ensuring people wear them publicly.With monitoring personnel,it is impossible to ensure that people wear face masks;automated systems are a much superior option for face mask detection and monitoring.This paper introduces a simple and efficient approach for masked face detection.The architecture of the proposed approach is very straightforward;it combines deep learning and local binary patterns to extract features and classify themasmasked or unmasked.The proposed systemrequires hardware withminimal power consumption compared to state-of-the-art deep learning algorithms.Our proposed system maintains two steps.At first,this work extracted the local features of an image by using a local binary pattern descriptor,and then we used deep learning to extract global features.The proposed approach has achieved excellent accuracy and high performance.The performance of the proposed method was tested on three benchmark datasets:the realworld masked faces dataset(RMFD),the simulated masked faces dataset(SMFD),and labeled faces in the wild(LFW).Performancemetrics for the proposed technique weremeasured in terms of accuracy,precision,recall,and F1-score.Results indicated the efficiency of the proposed technique,providing accuracies of 99.86%,99.98%,and 100%for RMFD,SMFD,and LFW,respectively.Moreover,the proposed method outperformed state-of-the-art deep learning methods in the recent bibliography for the same problem under study and on the same evaluation datasets.展开更多
Focused on the task of fast and accurate armored target detection in ground battlefield,a detection method based on multi-scale representation network(MS-RN) and shape-fixed Guided Anchor(SF-GA)scheme is proposed.Firs...Focused on the task of fast and accurate armored target detection in ground battlefield,a detection method based on multi-scale representation network(MS-RN) and shape-fixed Guided Anchor(SF-GA)scheme is proposed.Firstly,considering the large-scale variation and camouflage of armored target,a new MS-RN integrating contextual information in battlefield environment is designed.The MS-RN extracts deep features from templates with different scales and strengthens the detection ability of small targets.Armored targets of different sizes are detected on different representation features.Secondly,aiming at the accuracy and real-time detection requirements,improved shape-fixed Guided Anchor is used on feature maps of different scales to recommend regions of interests(ROIs).Different from sliding or random anchor,the SF-GA can filter out 80% of the regions while still improving the recall.A special detection dataset for armored target,named Armored Target Dataset(ARTD),is constructed,based on which the comparable experiments with state-of-art detection methods are conducted.Experimental results show that the proposed method achieves outstanding performance in detection accuracy and efficiency,especially when small armored targets are involved.展开更多
This paper proposes a multi-scale self-recovery(MSSR)approach to protect images against content forgery.The main idea is to provide more resistance against image tampering while enabling the recovery process in a mult...This paper proposes a multi-scale self-recovery(MSSR)approach to protect images against content forgery.The main idea is to provide more resistance against image tampering while enabling the recovery process in a multi-scale quality manner.In the proposed approach,the reference data composed of several parts and each part is protected by a channel coding rate according to its importance.The first part,which is used to reconstruct a rough approximation of the original image,is highly protected in order to resist against higher tampering rates.Other parts are protected with lower rates according to their importance leading to lower tolerable tampering rate(TTR),but the higher quality of the recovered images.The proposed MSSR approach is an efficient solution for the main disadvantage of the current methods,which either recover a tampered image in low tampering rates or fails when tampering rate is above the TTR value.The simulation results on 10000 test images represent the efficiency of the multi-scale self-recovery feature of the proposed approach in comparison with the existing methods.展开更多
This paper introduces a multi-scale morphological edge detection algorithm to extract SAR image edge which suffers seriously from noise. Combining the basic theme of morphology with that of multi-scale analysis, the a...This paper introduces a multi-scale morphological edge detection algorithm to extract SAR image edge which suffers seriously from noise. Combining the basic theme of morphology with that of multi-scale analysis, the algorithm presents the outstanding characteristics of accuracy and robustness. Comparative Experiments reveal its fine performance.展开更多
Inspired by the coarse-to-fine visual perception process of human vision system,a new approach based on Gaussian multi-scale space for defect detection of industrial products was proposed.By selecting different scale ...Inspired by the coarse-to-fine visual perception process of human vision system,a new approach based on Gaussian multi-scale space for defect detection of industrial products was proposed.By selecting different scale parameters of the Gaussian kernel,the multi-scale representation of the original image data could be obtained and used to constitute the multi- variate image,in which each channel could represent a perceptual observation of the original image from different scales.The Multivariate Image Analysis (MIA) techniques were used to extract defect features information.The MIA combined Principal Component Analysis (PCA) to obtain the principal component scores of the multivariate test image.The Q-statistic image, derived from the residuals after the extraction of the first principal component score and noise,could be used to efficiently reveal the surface defects with an appropriate threshold value decided by training images.Experimental results show that the proposed method performs better than the gray histogram-based method.It has less sensitivity to the inhomogeneous of illumination,and has more robustness and reliability of defect detection with lower pseudo reject rate.展开更多
人脸识别技术广泛应用于考勤管理、移动支付等智慧建设中。伴随着常态化的口罩干扰,传统人脸识别算法已无法满足实际应用需求,为此,本文利用深度学习模型SSD以及FaceNet模型对人脸识别系统展开设计。首先,为消除现有数据集中亚洲人脸占...人脸识别技术广泛应用于考勤管理、移动支付等智慧建设中。伴随着常态化的口罩干扰,传统人脸识别算法已无法满足实际应用需求,为此,本文利用深度学习模型SSD以及FaceNet模型对人脸识别系统展开设计。首先,为消除现有数据集中亚洲人脸占比小造成的类内间距变化差距不明显的问题,在CAS-IA Web Face公开数据集的基础上对亚洲人脸数据进行扩充;其次,为解决不同口罩样式对特征提取的干扰,使用SSD人脸检测模型与DLIB人脸关键点检测模型提取人脸关键点,并利用人脸关键点与口罩的空间位置关系,额外随机生成不同的口罩人脸,组成混合数据集;最后,在混合数据集上进行模型训练并将训练好的模型移植到人脸识别系统中,进行检测速度与识别精度验证。实验结果表明,系统的实时识别速度达20 fps以上,人脸识别模型准确率在构建的混合数据集中达到97.1%,在随机抽取的部分LFW数据集验证的准确率达99.7%,故而该系统可满足实际应用需求,在一定程度上提高人脸识别的鲁棒性与准确性。展开更多
Heart rate is an important vital characteristic which indicates physical and mental health status.Typically heart rate measurement instruments require direct contact with the skin which is time-consuming and costly.Th...Heart rate is an important vital characteristic which indicates physical and mental health status.Typically heart rate measurement instruments require direct contact with the skin which is time-consuming and costly.Therefore,the study of non-contact heart rate measurement methods is of great importance.Based on the principles of photoelectric volumetric tracing,we use a computer device and camera to capture facial images,accurately detect face regions,and to detect multiple facial images using a multi-target tracking algorithm.Then after the regional segmentation of the facial image,the signal acquisition of the region of interest is further resolved.Finally,frequency detection of the collected Photo-plethysmography(PPG)and Electrocardiography(ECG)signals is completed with peak detection,Fourier analysis,and a Waveletfilter.The experimental results show that the subject’s heart rate can be detected quickly and accurately even when monitoring multiple facial targets simultaneously.展开更多
基金the Scientific Research Fund of Hunan Provincial Education Department(23A0423).
文摘Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery.Additionally,these complexities contribute to inaccuracies in target localization and hinder precise target categorization.This paper addresses these challenges by proposing a solution:The YOLO-MFD model(YOLO-MFD:Remote Sensing Image Object Detection withMulti-scale Fusion Dynamic Head).Before presenting our method,we delve into the prevalent issues faced in remote sensing imagery analysis.Specifically,we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds.To resolve these issues,we introduce a novel approach.First,we propose the implementation of a lightweight multi-scale module called CEF.This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information.It effectively addresses the issues of missed detection and mistaken alarms that are common in remote sensing imagery.Second,an additional layer of small target detection heads is added,and a residual link is established with the higher-level feature extraction module in the backbone section.This allows the model to incorporate shallower information,significantly improving the accuracy of target localization in remotely sensed images.Finally,a dynamic head attentionmechanism is introduced.This allows themodel to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes.Consequently,the precision of object detection is significantly improved.The trial results show that the YOLO-MFD model shows improvements of 6.3%,3.5%,and 2.5%over the original YOLOv8 model in Precision,map@0.5 and map@0.5:0.95,separately.These results illustrate the clear advantages of the method.
基金the Key Research and Development Program of Hainan Province(Grant Nos.ZDYF2023GXJS163,ZDYF2024GXJS014)National Natural Science Foundation of China(NSFC)(Grant Nos.62162022,62162024)+2 种基金the Major Science and Technology Project of Hainan Province(Grant No.ZDKJ2020012)Hainan Provincial Natural Science Foundation of China(Grant No.620MS021)Youth Foundation Project of Hainan Natural Science Foundation(621QN211).
文摘Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in thefield of small object detection on unmanned aerial vehicles(UAVs).This task is challenging due to variations inUAV flight altitude,differences in object scales,as well as factors like flight speed and motion blur.To enhancethe detection efficacy of small targets in drone aerial imagery,we propose an enhanced You Only Look Onceversion 7(YOLOv7)algorithm based on multi-scale spatial context.We build the MSC-YOLO model,whichincorporates an additional prediction head,denoted as P2,to improve adaptability for small objects.We replaceconventional downsampling with a Spatial-to-Depth Convolutional Combination(CSPDC)module to mitigatethe loss of intricate feature details related to small objects.Furthermore,we propose a Spatial Context Pyramidwith Multi-Scale Attention(SCPMA)module,which captures spatial and channel-dependent features of smalltargets acrossmultiple scales.This module enhances the perception of spatial contextual features and the utilizationof multiscale feature information.On the Visdrone2023 and UAVDT datasets,MSC-YOLO achieves remarkableresults,outperforming the baseline method YOLOv7 by 3.0%in terms ofmean average precision(mAP).The MSCYOLOalgorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets inUAV aerial photography,providing strong support for practical applications.
基金This work was supported by National Natural Science Foundation of China:Grant No.62106048.
文摘The detection of ash content in coal slime flotation tailings using deep learning can be hindered by various factors such as foam,impurities,and changing lighting conditions that disrupt the collection of tailings images.To address this challenge,we present a method for ash content detection in coal slime flotation tailings.This method utilizes chromatographic filter paper sampling and a multi-scale residual network,which we refer to as MRCN.Initially,tailings are sampled using chromatographic filter paper to obtain static tailings images,effectively isolating interference factors at the flotation site.Subsequently,the MRCN,consisting of a multi-scale residual network,is employed to extract image features and compute ash content.Within the MRCN structure,tailings images undergo convolution operations through two parallel branches that utilize convolution kernels of different sizes,enabling the extraction of image features at various scales and capturing a more comprehensive representation of the ash content information.Furthermore,a channel attention mechanism is integrated to enhance the performance of the model.The combination of the multi-scale residual structure and the channel attention mechanism within MRCN results in robust capabilities for image feature extraction and ash content detection.Comparative experiments demonstrate that this proposed approach,based on chromatographic filter paper sampling and the multi-scale residual network,exhibits significantly superior performance in the detection of ash content in coal slime flotation tailings.
文摘Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. Processing speed and detection accuracy of the face detection have been improved continuously. This paper describes a novel method of fast face detection with multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small difference of pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that processing speed of our method is 3.66 times faster than a conventional method, adopting HOG features combined to an SVM classifier, without accuracy degradation.
文摘In response to the challenge of low detection accuracy and susceptibility to missed and false detections of small targets in unmanned aerial vehicles(UAVs)aerial images,an improved UAV image target detection algorithm based on YOLOv8 was proposed in this study.To begin with,the CoordAtt attention mechanism was employed to enhance the feature extraction capability of the backbone network,thereby reducing interference from backgrounds.Additionally,the BiFPN feature fusion network with an added small object detection layer was used to enhance the model's ability to perceive for small objects.Furthermore,a multi-level fusion module was designed and proposed to effectively integrate shallow and deep information.The use of an enhanced MPDIoU loss function further improved detection performance.The experimental results based on the publicly available VisDrone2019 dataset showed that the improved model outperformed the YOLOv8 baseline model,mAP@0.5 improved by 20%,and the improved method improved the detection accuracy of the model for small targets.
基金supported in part by the National Key Research Project of China under Grant No.2023YFA1009402General Science and Technology Plan Items in Zhejiang Province ZJKJT-2023-02.
文摘With the remarkable advancements in machine vision research and its ever-expanding applications,scholars have increasingly focused on harnessing various vision methodologies within the industrial realm.Specifically,detecting vehicle floor welding points poses unique challenges,including high operational costs and limited portability in practical settings.To address these challenges,this paper innovatively integrates template matching and the Faster RCNN algorithm,presenting an industrial fusion cascaded solder joint detection algorithm that seamlessly blends template matching with deep learning techniques.This algorithm meticulously weights and fuses the optimized features of both methodologies,enhancing the overall detection capabilities.Furthermore,it introduces an optimized multi-scale and multi-template matching approach,leveraging a diverse array of templates and image pyramid algorithms to bolster the accuracy and resilience of object detection.By integrating deep learning algorithms with this multi-scale and multi-template matching strategy,the cascaded target matching algorithm effectively accurately identifies solder joint types and positions.A comprehensive welding point dataset,labeled by experts specifically for vehicle detection,was constructed based on images from authentic industrial environments to validate the algorithm’s performance.Experiments demonstrate the algorithm’s compelling performance in industrial scenarios,outperforming the single-template matching algorithm by 21.3%,the multi-scale and multitemplate matching algorithm by 3.4%,the Faster RCNN algorithm by 19.7%,and the YOLOv9 algorithm by 17.3%in terms of solder joint detection accuracy.This optimized algorithm exhibits remarkable robustness and portability,ideally suited for detecting solder joints across diverse vehicle workpieces.Notably,this study’s dataset and feature fusion approach can be a valuable resource for other algorithms seeking to enhance their solder joint detection capabilities.This work thus not only presents a novel and effective solution for industrial solder joint detection but lays the groundwork for future advancements in this critical area.
文摘Detection of floating garbage in inland rivers is crucial for water environmental protection,as it effectively reduces ecological damage and ensures the safety of water resources.To address the inefficiency of traditional cleanup methods and the challenges in detecting small targets,an improved YOLOv5 object detection model was proposed in this study.In order to enhance the model’s sensitivity to small targets and mitigate the impact of redundant information on detection performance,a bi-level routing attention mechanism was introduced and embedded into the backbone network.Additionally,a multi-scale detection head was incorporated into the model,allowing for more comprehensive coverage of floating garbage of various sizes through multi-scale feature extraction and detection.The Focal-EIoU loss function was also employed to optimize the model parameters,improving localization accuracy.Experimental results on the publicly available FloW_Img dataset demonstrated that the improved YOLOv5 model outperforms the original YOLOv5 model in terms of precision and recall,achieving a mAP(mean average precision)of 86.12%,with significant improvements and faster convergence.
基金Deanship of Research and Graduate Studies at King Khalid University for funding this work through Small Group Research Project under Grant Number RGP1/261/45.
文摘Breast cancer is a significant threat to the global population,affecting not only women but also a threat to the entire population.With recent advancements in digital pathology,Eosin and hematoxylin images provide enhanced clarity in examiningmicroscopic features of breast tissues based on their staining properties.Early cancer detection facilitates the quickening of the therapeutic process,thereby increasing survival rates.The analysis made by medical professionals,especially pathologists,is time-consuming and challenging,and there arises a need for automated breast cancer detection systems.The upcoming artificial intelligence platforms,especially deep learning models,play an important role in image diagnosis and prediction.Initially,the histopathology biopsy images are taken from standard data sources.Further,the gathered images are given as input to the Multi-Scale Dilated Vision Transformer,where the essential features are acquired.Subsequently,the features are subjected to the Bidirectional Long Short-Term Memory(Bi-LSTM)for classifying the breast cancer disorder.The efficacy of the model is evaluated using divergent metrics.When compared with other methods,the proposed work reveals that it offers impressive results for detection.
基金Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2023R349)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.This study is supported via funding from Prince Sattam bin Abdulaziz University Project Number(PSAU/2023/R/1444).
文摘Face recognition technology automatically identifies an individual from image or video sources.The detection process can be done by attaining facial characteristics from the image of a subject face.Recent developments in deep learning(DL)and computer vision(CV)techniques enable the design of automated face recognition and tracking methods.This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking(HHODL-AFDT)method.The proposed HHODL-AFDT model involves a Faster region based convolution neural network(RCNN)-based face detection model and HHO-based hyperparameter opti-mization process.The presented optimal Faster RCNN model precisely rec-ognizes the face and is passed into the face-tracking model using a regression network(REGN).The face tracking using the REGN model uses the fea-tures from neighboring frames and foresees the location of the target face in succeeding frames.The application of the HHO algorithm for optimal hyperparameter selection shows the novelty of the work.The experimental validation of the presented HHODL-AFDT algorithm is conducted using two datasets and the experiment outcomes highlighted the superior performance of the HHODL-AFDT model over current methodologies with maximum accuracy of 90.60%and 88.08%under PICS and VTB datasets,respectively.
文摘A face-mask object detection model incorporating hybrid dilation convolutional network termed ResNet Hybrid-dilation-convolution Face-mask-detector (RHF) is proposed in this paper. Furthermore, a lightweight face-mask dataset named Light Masked Face Dataset (LMFD) and a medium-sized face-mask dataset named Masked Face Dataset (MFD) with data augmentation methods applied is also constructed in this paper. The hybrid dilation convolutional network is able to expand the perception of the convolutional kernel without concern about the discontinuity of image information during the convolution process. For the given two datasets being constructed above, the trained models are significantly optimized in terms of detection performance, training time, and other related metrics. By using the MFD dataset of 55,905 images, the RHF model requires roughly 10 hours less training time compared to ResNet50 with better detection results with mAP of 93.45%.
文摘Aiming at the problem of low accuracy of traditional target detection methods for target detection in endoscopes in substation environments, a CNN-based real-time detection method for masked targets is proposed. The method adopts the overall design of backbone network, detection network and algorithmic parameter optimisation method, completes the model training on the self-constructed occlusion target dataset, and adopts the multi-scale perception method for target detection. The HNM algorithm is used to screen positive and negative samples during the training process, and the NMS algorithm is used to post-process the prediction results during the detection process to improve the detection efficiency. After experimental validation, the obtained model has the multi-class average predicted value (mAP) of the dataset. It has general advantages over traditional target detection methods. The detection time of a single target on FDDB dataset is 39 ms, which can meet the need of real-time target detection. In addition, the project team has successfully deployed the method into substations and put it into use in many places in Beijing, which is important for achieving the anomaly of occlusion target detection.
文摘Recent security applications in mobile technologies and computer sys-tems use face recognition for high-end security.Despite numerous security tech-niques,face recognition is considered a high-security control.Developers fuse and carry out face identification as an access authority into these applications.Still,face identification authentication is sensitive to attacks with a 2-D photo image or captured video to access the system as an authorized user.In the existing spoofing detection algorithm,there was some loss in the recreation of images.This research proposes an unobtrusive technique to detect face spoofing attacks that apply a single frame of the sequenced set of frames to overcome the above-said problems.This research offers a novel Edge-Net autoencoder to select convoluted and dominant features of the input diffused structure.First,this pro-posed method is tested with the Cross-ethnicity Face Anti-spoofing(CASIA),Fetal alcohol spectrum disorders(FASD)dataset.This database has three models of attacks:distorted photographs in printed form,photographs with removed eyes portion,and video attacks.The images are taken with three different quality cameras:low,average,and high-quality real and spoofed images.An extensive experimental study was performed with CASIA-FASD,3 Diagnostic Machine Aid-Digital(DMAD)dataset that proved higher results when compared to existing algorithms.
基金Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R442),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia。
文摘Face mask detection has several applications,including real-time surveillance,biometrics,etc.Identifying face masks is also helpful for crowd control and ensuring people wear them publicly.With monitoring personnel,it is impossible to ensure that people wear face masks;automated systems are a much superior option for face mask detection and monitoring.This paper introduces a simple and efficient approach for masked face detection.The architecture of the proposed approach is very straightforward;it combines deep learning and local binary patterns to extract features and classify themasmasked or unmasked.The proposed systemrequires hardware withminimal power consumption compared to state-of-the-art deep learning algorithms.Our proposed system maintains two steps.At first,this work extracted the local features of an image by using a local binary pattern descriptor,and then we used deep learning to extract global features.The proposed approach has achieved excellent accuracy and high performance.The performance of the proposed method was tested on three benchmark datasets:the realworld masked faces dataset(RMFD),the simulated masked faces dataset(SMFD),and labeled faces in the wild(LFW).Performancemetrics for the proposed technique weremeasured in terms of accuracy,precision,recall,and F1-score.Results indicated the efficiency of the proposed technique,providing accuracies of 99.86%,99.98%,and 100%for RMFD,SMFD,and LFW,respectively.Moreover,the proposed method outperformed state-of-the-art deep learning methods in the recent bibliography for the same problem under study and on the same evaluation datasets.
基金supported by the National Key Research and Development Program of China under grant 2016YFC0802904National Natural Science Foundation of China under grant61671470the Postdoctoral Science Foundation Funded Project of China under grant 2017M623423。
文摘Focused on the task of fast and accurate armored target detection in ground battlefield,a detection method based on multi-scale representation network(MS-RN) and shape-fixed Guided Anchor(SF-GA)scheme is proposed.Firstly,considering the large-scale variation and camouflage of armored target,a new MS-RN integrating contextual information in battlefield environment is designed.The MS-RN extracts deep features from templates with different scales and strengthens the detection ability of small targets.Armored targets of different sizes are detected on different representation features.Secondly,aiming at the accuracy and real-time detection requirements,improved shape-fixed Guided Anchor is used on feature maps of different scales to recommend regions of interests(ROIs).Different from sliding or random anchor,the SF-GA can filter out 80% of the regions while still improving the recall.A special detection dataset for armored target,named Armored Target Dataset(ARTD),is constructed,based on which the comparable experiments with state-of-art detection methods are conducted.Experimental results show that the proposed method achieves outstanding performance in detection accuracy and efficiency,especially when small armored targets are involved.
文摘This paper proposes a multi-scale self-recovery(MSSR)approach to protect images against content forgery.The main idea is to provide more resistance against image tampering while enabling the recovery process in a multi-scale quality manner.In the proposed approach,the reference data composed of several parts and each part is protected by a channel coding rate according to its importance.The first part,which is used to reconstruct a rough approximation of the original image,is highly protected in order to resist against higher tampering rates.Other parts are protected with lower rates according to their importance leading to lower tolerable tampering rate(TTR),but the higher quality of the recovered images.The proposed MSSR approach is an efficient solution for the main disadvantage of the current methods,which either recover a tampered image in low tampering rates or fails when tampering rate is above the TTR value.The simulation results on 10000 test images represent the efficiency of the multi-scale self-recovery feature of the proposed approach in comparison with the existing methods.
基金Supported the NatioIlal Naturel Science Foundation of China(No.69831040)
文摘This paper introduces a multi-scale morphological edge detection algorithm to extract SAR image edge which suffers seriously from noise. Combining the basic theme of morphology with that of multi-scale analysis, the algorithm presents the outstanding characteristics of accuracy and robustness. Comparative Experiments reveal its fine performance.
基金supported in part by the Natural Science Foundation of China (NSFC) (Grant No:50875240).
文摘Inspired by the coarse-to-fine visual perception process of human vision system,a new approach based on Gaussian multi-scale space for defect detection of industrial products was proposed.By selecting different scale parameters of the Gaussian kernel,the multi-scale representation of the original image data could be obtained and used to constitute the multi- variate image,in which each channel could represent a perceptual observation of the original image from different scales.The Multivariate Image Analysis (MIA) techniques were used to extract defect features information.The MIA combined Principal Component Analysis (PCA) to obtain the principal component scores of the multivariate test image.The Q-statistic image, derived from the residuals after the extraction of the first principal component score and noise,could be used to efficiently reveal the surface defects with an appropriate threshold value decided by training images.Experimental results show that the proposed method performs better than the gray histogram-based method.It has less sensitivity to the inhomogeneous of illumination,and has more robustness and reliability of defect detection with lower pseudo reject rate.
文摘人脸识别技术广泛应用于考勤管理、移动支付等智慧建设中。伴随着常态化的口罩干扰,传统人脸识别算法已无法满足实际应用需求,为此,本文利用深度学习模型SSD以及FaceNet模型对人脸识别系统展开设计。首先,为消除现有数据集中亚洲人脸占比小造成的类内间距变化差距不明显的问题,在CAS-IA Web Face公开数据集的基础上对亚洲人脸数据进行扩充;其次,为解决不同口罩样式对特征提取的干扰,使用SSD人脸检测模型与DLIB人脸关键点检测模型提取人脸关键点,并利用人脸关键点与口罩的空间位置关系,额外随机生成不同的口罩人脸,组成混合数据集;最后,在混合数据集上进行模型训练并将训练好的模型移植到人脸识别系统中,进行检测速度与识别精度验证。实验结果表明,系统的实时识别速度达20 fps以上,人脸识别模型准确率在构建的混合数据集中达到97.1%,在随机抽取的部分LFW数据集验证的准确率达99.7%,故而该系统可满足实际应用需求,在一定程度上提高人脸识别的鲁棒性与准确性。
基金supported by the National Nature Science Foundation of China(Grant Number:61962010).
文摘Heart rate is an important vital characteristic which indicates physical and mental health status.Typically heart rate measurement instruments require direct contact with the skin which is time-consuming and costly.Therefore,the study of non-contact heart rate measurement methods is of great importance.Based on the principles of photoelectric volumetric tracing,we use a computer device and camera to capture facial images,accurately detect face regions,and to detect multiple facial images using a multi-target tracking algorithm.Then after the regional segmentation of the facial image,the signal acquisition of the region of interest is further resolved.Finally,frequency detection of the collected Photo-plethysmography(PPG)and Electrocardiography(ECG)signals is completed with peak detection,Fourier analysis,and a Waveletfilter.The experimental results show that the subject’s heart rate can be detected quickly and accurately even when monitoring multiple facial targets simultaneously.