In this paper, an improved spatio-temporal alignment measurement method is presented to address the inertial matching measurement of hull deformation under the coexistence of time delay and large misalignment angle. Large misalignment angle and time delay often occur simultaneously and bring great challenges to the accurate measurement of hull deformation in space and time. The proposed method utilizes coarse alignment with large misalignment angle and time delay estimation of inertial measurement unit modeling to establish a brand-new spatio-temporally aligned hull deformation measurement model. In addition, two-step loop control is designed to ensure the accurate description of dynamic deformation angle and static deformation angle by the time-space alignment method of hull deformation. The experiments illustrate that the proposed method can effectively measure the hull deformation angle when time delay and large misalignment angle coexist.
Due to the time-varying topology and possible disturbances in a conflict environment, it is still challenging to maintain the mission performance of flying ad hoc networks (FANET), which limits the application of Unmanned Aerial Vehicle (UAV) swarms in harsh environments. This paper proposes an intelligent framework to quickly recover the cooperative coverage mission by aggregating the historical spatio-temporal network with the attention mechanism. The mission resilience metric is introduced in conjunction with connectivity and coverage status information to simplify the optimization model. A spatio-temporal node pooling method is proposed to ensure all node location features can be updated after destruction by capturing the temporal network structure. Combined with the corresponding Laplacian matrix as the hyperparameter, a recovery algorithm based on the multi-head attention graph network is designed to achieve rapid recovery. Simulation results show that the proposed framework recovers connectivity and coverage more rapidly and effectively than existing studies. The average connectivity and coverage are improved by 17.92% and 16.96%, respectively, compared with the state-of-the-art model. Furthermore, an ablation study compares the contribution of each improvement. The proposed model can be used to support resilient network design for real-time mission execution.
Decline in wildlife populations is manifest globally, regionally and locally. A wildlife decline of 68% has been reported in Kenya’s rangelands, with Baringo County experiencing more than 85% wildlife loss in the last four decades. Greater Kudu (Tragelaphus strepsiceros) is endemic to the Lake Bogoria landscape in Baringo County and constitutes a major tourist attraction for the region; its photo appears on the County’s logo, making it a flagship species. Tourism plays a central role in Baringo County’s economy and is a major source of potential growth and employment creation. The study was carried out to assess spatio-temporal change of dispersal areas of Greater Kudu (GK) in the Lake Bogoria landscape in the last four years for enhanced adaptive management and improved livelihoods. Primary GK population distribution data collected in December 2022 and secondary data acquired from Lake Bogoria National Game Reserve (LBNGR) for 2019 and 2020 were digitized in a Geographic Information System (GIS). Measures of dispersion and point pattern analysis (PPA) were used to analyze dispersal of the GK population using GIS. Spatio-temporal change of GK dispersal in LBNGR was evident; thus the null hypothesis was rejected. It is recommended that anthropogenic activities contributing to GK’s habitat degradation be curbed by providing alternative livelihood sources and promoting community adoption of sustainable technologies for improved livelihoods.
Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recognize actions. This study focuses on human activity in video sequences acquired with an RGB camera because of its vast range of real-world applications. It uses a two-stream ConvNet to extract spatial and temporal information and proposes a fine-tuned deep neural network. Moreover, the transfer learning paradigm is adopted to extract varied and fixed frames while reusing object identification information. Six state-of-the-art pre-trained models are exploited to find the best model for spatial feature extraction. For the temporal sequence, this study uses dense optical flow following the two-stream ConvNet and Bidirectional Long Short-Term Memory (BiLSTM) to capture long-term dependencies. Two state-of-the-art datasets, UCF101 and HMDB51, are used for evaluation purposes. In addition, seven state-of-the-art optimizers are used to fine-tune the proposed network parameters. Furthermore, this study utilizes an ensemble mechanism to aggregate spatial-temporal features using a four-stream Convolutional Neural Network (CNN), where two streams use RGB data. In contrast, the others use optical flow images. Finally, the proposed ensemble approach using max hard voting outperforms state-of-the-art methods with 96.30% and 90.07% accuracies on the UCF101 and HMDB51 datasets.
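The ensemble step in the abstract above can be illustrated with a minimal sketch. The abstract does not spell out the voting rule, so this assumes that "max hard voting" means each stream emits one class label per clip and the most frequent label wins; the function name `hard_vote` and the tie-breaking rule (first stream in order wins) are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def hard_vote(stream_predictions):
    """Return the majority label among per-stream predictions for one clip.

    Assumption: ties are broken in favour of the label that appears
    first in stream order. The source abstract does not state a rule.
    """
    if not stream_predictions:
        raise ValueError("at least one stream prediction is required")
    counts = Counter(stream_predictions)
    best = max(counts.values())
    # Walk the predictions in stream order so tie-breaking is deterministic.
    for label in stream_predictions:
        if counts[label] == best:
            return label


# Hypothetical labels from four streams (two RGB, two optical flow).
print(hard_vote(["run", "walk", "run", "jump"]))
```

Hard voting discards each stream's confidence scores; a soft-voting variant would average class probabilities instead, which is a different technique from the one named in the abstract.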
To illuminate the spatio-temporal variation characteristics and geochemical driving mechanism of soil pH in the Nenjiang River Basin, the National Multi-objective Regional Geochemical Survey data of topsoil, the Second National Soil Survey data and the Normalized Difference Vegetation Index (NDVI) were analyzed. The areas of neutral and alkaline soil decreased by 21100 km² and 30500 km², respectively, while those of strongly alkaline, extremely alkaline, and strongly acidic soil increased by 19600 km², 18200 km², and 15500 km², respectively, during the past 30 years. NDVI decreased with increasing soil pH when soil pH > 8.0, and the relationship was reversed when soil pH < 5.0. There were significant differences in soil pH among surface cover types, in ascending order: arbor < reed < maize < rice < high- and medium-covered meadow < low-covered meadow < Puccinellia. The weathering products of minerals rich in K₂O, Na₂O, CaO, and MgO entered the low plain and were enriched in different parts by water transportation and lake deposition, while Fe and Al remained in the low hilly areas, which was the geochemical driving mechanism. The results of this study will provide a scientific basis for rational decisions on soil acidification and salinization.
With the development of the Internet of Things (IoT), spatio-temporal crowdsourcing (mobile crowdsourcing) has become an emerging paradigm for addressing location-based sensing tasks. However, the delay caused by network transmission has led to low data processing efficiency. Fortunately, edge computing can solve this problem, effectively reduce the delay of data transmission, and improve data processing capacity, so that the crowdsourcing platform can make better decisions faster. Therefore, this paper combines spatio-temporal crowdsourcing and edge computing to study the Multi-Objective Optimization Task Assignment (MOO-TA) problem in the edge computing environment. The proposed online incentive mechanism considers the task difficulty attribute to motivate crowd workers to perform sensing tasks in unpopular areas. In this paper, the Weighted and Multi-Objective Particle Swarm Combination (WAMOPSC) algorithm is proposed to maximize both the platform’s and the crowd workers’ utility, so as to maximize social welfare. The algorithm combines the traditional Linear Weighted Summation (LWS) algorithm and the Multi-Objective Particle Swarm Optimization (MOPSO) algorithm to find as many Pareto-optimal solutions of the multi-objective optimization task assignment problem as possible for the crowdsourcing platform to choose from. Through comparison experiments on real data sets, the effectiveness and feasibility of the proposed method are evaluated.
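The two ingredients named in the abstract above, linear weighted summation and Pareto optimality, can be sketched in a few lines. This is only an illustration of the general concepts for a maximization problem, not the WAMOPSC algorithm itself (the particle swarm search is omitted); the function names, the candidate utilities, and the weights are hypothetical.

```python
def dominates(q, p):
    """True if q is at least as good as p on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(a >= b for a, b in zip(q, p)) and any(a > b for a, b in zip(q, p))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weighted_sum(point, weights):
    """Linear weighted summation: collapse several objectives into one score."""
    return sum(w * x for w, x in zip(weights, point))


# Hypothetical (platform utility, worker utility) pairs for five assignments.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = pareto_front(candidates)
print(front)

# One way a platform could pick from the front: equal weights (an assumption).
print(max(front, key=lambda p: weighted_sum(p, (0.5, 0.5))))
```

A weighted sum returns a single compromise per weight vector, whereas the Pareto front keeps every non-dominated trade-off, which is why the two are complementary.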
Spatio-temporal analysis of drought provides valuable information for drought management and damage mitigation. In this study, the Standardized Precipitation Index at the time scale of 6 months (SPI-6) is selected to reflect drought conditions in the North-Eastern coastal region of Vietnam. The drought events and their characteristics from 1981 to 2019 are detected at 9 meteorological stations and 10 CHIRPS rainfall stations. The spatio-temporal variation of drought in the study region is analyzed on the basis of the number, duration, severity, intensity, and peak of the detected drought events at the 19 stations. The results show that from 1981 to 2019 the drought events mainly occurred with 1-season duration and moderate intensity and peak. The number, duration, severity, and peak of the drought events were the greatest in the period 2001-2010 and the smallest in the period 2011-2019. Among the 19 stations, the drought duration tends to decrease at 11 stations, increase at 7 stations, and vary only slightly at 1 station; the drought severity tends to decrease at 14 stations, increase at 4 stations, and show no significant trend at 1 station; the drought intensity tends to decrease at 17 stations, increase at 1 station, and vary only slightly at 1 station; and the drought peak tends to decrease at 18 stations and increase at 1 station.
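The event characteristics listed in the abstract above (number, duration, severity, intensity, peak) are commonly derived from an SPI series by grouping consecutive below-threshold values into events. The sketch below assumes that convention; the threshold of -1.0, the definitions (severity as the sum of absolute SPI values, intensity as severity divided by duration, peak as the minimum SPI), and the function name are assumptions for illustration, since the abstract does not state the exact definitions used.

```python
def drought_events(spi, threshold=-1.0):
    """Group consecutive SPI values below `threshold` into drought events.

    Assumed definitions (not taken from the source abstract):
      duration  = number of consecutive time steps below the threshold
      severity  = sum of |SPI| over the event
      intensity = severity / duration
      peak      = lowest SPI value within the event
    """
    events = []
    run = []
    # A trailing +inf sentinel closes an event that reaches the series end.
    for value in list(spi) + [float("inf")]:
        if value < threshold:
            run.append(value)
        elif run:
            severity = sum(abs(v) for v in run)
            events.append({
                "duration": len(run),
                "severity": severity,
                "intensity": severity / len(run),
                "peak": min(run),
            })
            run = []
    return events


# Hypothetical SPI-6 values for five consecutive months.
for event in drought_events([0.2, -1.5, -2.0, 0.1, -1.2]):
    print(event)
```

The number of drought events at a station is then simply the length of the returned list.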
Spatio-temporal models are valuable tools for disease mapping and understanding the geographical distribution of diseases and temporal dynamics. Spatio-temporal models have been proven empirically to be very complex, and this complexity has led many to oversimplify and model the spatial and temporal dependencies independently. Unlike common practice, this study formulated a new spatio-temporal model in a Bayesian hierarchical framework that accounts for spatial and temporal dependencies jointly. The spatial and temporal dependencies were dynamically modelled via the Matérn exponential covariance function. The temporal aspect was captured by the parameters of the exponential with a first-order autoregressive structure. Inferences about the parameters were obtained via Markov Chain Monte Carlo (MCMC) techniques, and the spatio-temporal maps were obtained by mapping stable posterior means from the specific location and time from the best model that includes the significant risk factors. The model formulated was fitted to both simulation data and Kenya meningitis incidence data from 2013 to 2019 along with two covariates: Gross County Product (GCP) and average rainfall. The study found that both average rainfall and GCP had a significant positive association with meningitis occurrence. Also, regarding geographical distribution, the spatio-temporal maps showed that meningitis is not evenly distributed across the country, as some counties reported a high number of cases compared with other counties.
Advances in machine vision systems have revolutionized applications such as autonomous driving, robotic navigation, and augmented reality. Despite substantial progress, challenges persist, including dynamic backgrounds, occlusion, and limited labeled data. To address these challenges, we introduce a comprehensive methodology to enhance image classification and object detection accuracy. The proposed approach involves the integration of multiple methods in a complementary way. The process commences with the application of Gaussian filters to mitigate the impact of noise interference. These images are then processed for segmentation using Fuzzy C-Means segmentation in parallel with saliency mapping techniques to find the most prominent regions. The Binary Robust Independent Elementary Features (BRIEF) characteristics are then extracted from data derived from saliency maps and segmented images. For precise object separation, Oriented FAST and Rotated BRIEF (ORB) algorithms are employed. Genetic Algorithms (GAs) are used to optimize Random Forest classifier parameters, which leads to improved performance. Our method stands out due to its comprehensive approach, adeptly addressing challenges such as changing backdrops, occlusion, and limited labeled data concurrently. A significant enhancement has been achieved by integrating Genetic Algorithms (GAs) to precisely optimize parameters. This minor adjustment not only boosts the uniqueness of our system but also amplifies its overall efficacy. The proposed methodology has demonstrated notable classification accuracies of 90.9% and 89.0% on the challenging Corel-1k and MSRC datasets, respectively. Furthermore, detection accuracies of 87.2% and 86.6% have been attained. Although our method performed well on both datasets, it may face difficulties with real-world data, especially where datasets have highly complex backgrounds. Despite these limitations, GA integration for parameter optimization shows a notable strength in enhancing the overall adaptability and performance of our system.
In crime science, understanding the dynamics and interactions between crime events is crucial for comprehending the underlying factors that drive their occurrences. Nonetheless, gaining access to detailed spatiotemporal crime records from law enforcement faces significant challenges due to confidentiality concerns. In response to these challenges, this paper introduces an innovative analytical tool named “stppSim,” designed to synthesize fine-grained spatiotemporal point records while safeguarding the privacy of individual locations. By utilizing the open-source R platform, this tool ensures easy accessibility for researchers, facilitating download, re-use, and potential advancements in various research domains beyond crime science.
The data analysis of blasting sites has long been a goal of researchers in the field. The rise of mobile blasting robots has aroused many researchers’ interest in machine learning methods for target detection in the field of blasting. Serverless computing can provide a variety of computing services for people without hardware foundations or rich software development experience, which has aroused interest in how to use it in the field of machine learning. In this paper, we design a distributed machine learning training application based on the AWS Lambda platform. Based on data parallelism, data aggregation and training synchronization in Function as a Service (FaaS) are effectively realized. The application also encrypts the data set, effectively reducing the risk of data leakage. We rent a cloud server and a Lambda, and then conduct experiments to evaluate our application. Our results indicate the effectiveness, rapidity, and economy of distributed training on FaaS.
In clinical practice, the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications. Measuring the amount of each type of urine sediment allows for screening, diagnosis and evaluation of kidney and urinary tract disease, providing insight into the specific type and severity. However, manual urine sediment examination is labor-intensive, time-consuming, and subjective. Traditional machine learning based object detection methods require hand-crafted features for localization and classification, which have poor generalization capabilities and make it difficult to quickly and accurately detect the number of urine sediments. Deep learning based object detection methods have the potential to address the challenges mentioned above, but these methods require access to large urine sediment image datasets. Unfortunately, only a limited number of publicly available urine sediment datasets are currently available. To alleviate the lack of urine sediment datasets in medical image analysis, we propose a new dataset named UriSed2K, which contains 2465 high-quality images annotated with expert guidance. Two main challenges are associated with our dataset: a large number of small objects and the occlusion between these small objects. Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset. Specifically, our goal is to improve the accuracy and efficiency of the detection algorithm and, in doing so, provide medical professionals with an automatic detector that saves time and effort. We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO. The proposed algorithm comprises a local context attention module and a global background suppression module, which aid the detector in distinguishing urine sediment features in the image. The local context attention module captures context information beyond the object region, while the global background suppression module emphasizes objects in uninformative backgrounds. We comprehensively evaluate our method on the UriSed2K dataset, which includes seven categories of urine sediments, namely erythrocytes (red blood cells), leukocytes (white blood cells), epithelial cells, crystals, mycetes, broken erythrocytes, and broken leukocytes, achieving the best average precision (AP) of 95.3% while taking only 10 ms per image. The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.
Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
Recently, weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling. However, there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background information. Therefore, an intuitive idea is to infer annotations that cover more complete object and background regions for training. To this end, a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent labels. Specifically, the k-means clustering algorithm is first performed on both the colours and the coordinates of the original annotations, and the same labels are then assigned to points whose colours are similar to the colour cluster centres and which lie near the coordinate cluster centres. Next, the same annotations are further set for pixels with similar colours within each kernel neighbourhood. Extensive experiments on six benchmarks demonstrate that our method can significantly improve the performance and achieve state-of-the-art results.
Remote sensing imagery, due to its high altitude, presents inherent challenges characterized by multiple scales, limited target areas, and intricate backgrounds. These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery. Additionally, these complexities contribute to inaccuracies in target localization and hinder precise target categorization. This paper addresses these challenges by proposing the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting our method, we delve into the prevalent issues faced in remote sensing imagery analysis. Specifically, we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose the implementation of a lightweight multi-scale module called CEF. This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information. It effectively addresses the issues of missed detections and false alarms that are common in remote sensing imagery. Second, an additional layer of small target detection heads is added, and a residual link is established with the higher-level feature extraction module in the backbone section. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remotely sensed images. Finally, a dynamic head attention mechanism is introduced. This allows the model to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes. Consequently, the precision of object detection is significantly improved. The experimental results show that the YOLO-MFD model improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate the pixel-wise depth maps from off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then use LiDAR-based object detectors or focus on the perspective of image and depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and complex fusion mode with convolutions. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps to achieve more accurate depth information. Then we develop a novel Swin-Transformer-based backbone with a fusion module to process RGB image patches and depth map patches with two separate branches and fuse them using cross-attention to exchange information with each other. Furthermore, with the help of pixel-wise relative depth values in depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. The experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our proposed method and superior performance over previous counterparts.
In recent years, there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks. Despite these efforts, the detection of small objects in remote sensing remains a formidable challenge. The deep network structure brings about the loss of object features, nearly eliminating some subtle features associated with small objects in deep layers. Additionally, the features of small objects are susceptible to interference from background features contained within the image, leading to a decline in detection accuracy. Moreover, the sensitivity of small objects to bounding box perturbation further increases the detection difficulty. In this paper, we introduce a novel approach, Cross-Layer Fusion and Weighted Receptive Field-based YOLO (CAW-YOLO), specifically designed for small object detection in remote sensing. To address feature loss in deep layers, we have devised a cross-layer attention fusion module. Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention (BRA). To enhance the model’s capacity to perceive multi-scale objects, particularly small-scale objects, we introduce a weighted multi-receptive field atrous spatial pyramid pooling module. Furthermore, we mitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance (NWD) and Efficient Intersection over Union (EIoU) losses. The efficacy of the proposed model in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets. The experimental results demonstrate the model’s pronounced advantages in small object detection for remote sensing, surpassing the performance of current mainstream models.
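The Normalized Wasserstein Distance mentioned in the abstract above can be sketched as follows. This follows the commonly used formulation in which each box (cx, cy, w, h) is modelled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), so that the 2-Wasserstein distance reduces to a Euclidean distance between (cx, cy, w/2, h/2) vectors. The constant `c` is dataset-dependent, and the default here is only a placeholder. How CAW-YOLO weights NWD against EIoU is not stated in the abstract, so no combination is shown.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two boxes given as (cx, cy, w, h).

    Returns a similarity in (0, 1]; 1.0 means identical boxes.
    `c` is a normalising constant (placeholder default, an assumption).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # 2-Wasserstein distance between the two Gaussian box models.
    w2 = math.sqrt(
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-w2 / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 as boxes diverge."""
    return 1.0 - nwd(box_a, box_b, c)


# Two hypothetical 4x4 boxes whose centres are 5 pixels apart.
print(nwd((0, 0, 4, 4), (3, 4, 4, 4), c=5.0))
```

Unlike IoU, this measure stays smooth and non-zero even when two small boxes do not overlap at all, which is the usual motivation for using it on tiny objects.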
Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in the field of small object detection on unmanned aerial vehicles (UAVs). This task is challenging due to variations in UAV flight altitude, differences in object scales, as well as factors like flight speed and motion blur. To enhance the detection efficacy of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted as P2, to improve adaptability for small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of intricate feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the utilization of multiscale feature information. On the VisDrone2023 and UAVDT datasets, MSC-YOLO achieves remarkable results, outperforming the baseline method YOLOv7 by 3.0% in terms of mean average precision (mAP). The MSC-YOLO algorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets in UAV aerial photography, providing strong support for practical applications.
Insulator defect detection plays a vital role in maintaining the secure operation of power systems. To address the difficulty of detecting small objects and the missed detections caused by the small scale, variable scale, and fuzzy edge morphology of insulator defects, we construct an insulator dataset with 1600 samples containing flashovers and breakages. Then a simple and effective surface defect detection method of power line insulators for difficult small objects is proposed. Firstly, a high-resolution feature map is introduced and a small object prediction layer is added so that the model can detect tiny objects. Secondly, a simplified adaptive spatial feature fusion (SASFF) module is introduced to perform cross-scale spatial fusion to improve adaptability to variable multi-scale features. Finally, we propose an enhanced deformable attention mechanism (EDAM) module. By integrating a gating activation function, the model is further encouraged to learn a small number of critical sampling points near reference points, and the module can improve the perception of object morphology. The experimental results indicate that, on the dataset of flashover and breakage defects, this method improves the performance of YOLOv5, YOLOv7, and YOLOv8. In practical application, it can simply and effectively improve the precision of power line insulator defect detection and reduce missed detections for difficult small objects.
AIM:To evaluate the effect of low-degree astigmatism on objective visual quality through the Optical Quality Analysis System(OQAS).METHODS:This study enrolled 46 participants(aged 23 to 30y,90 eyes)with normal or corr...AIM:To evaluate the effect of low-degree astigmatism on objective visual quality through the Optical Quality Analysis System(OQAS).METHODS:This study enrolled 46 participants(aged 23 to 30y,90 eyes)with normal or corrected-to-normal vision.The cylindrical lenses(0,0.5,0.75,1.0,and 1.25 D)were placed at the axial direction(180°,45°,90°,and 135°)in front of the eyes with the best correction to form 16 types of regular low-degree astigmatism.OQAS was used to detect the objective visual quality,recorded as the objective scattering index(OSI),OQAS values at contrasts of 100%,20%,and 9%predictive visual acuity(OV100%,OV20%,and OV9%),modulation transfer function cut-off(MTFcut-off)and Strehl ratio(SR).The mixed effect linear model was used to compare objective visual quality differences between groups and examine associations between astigmatic magnitude and objective visual quality parameters.RESULTS:Apparent negative relationships between the magnitude of low astigmatism and objective visual quality were observed.The increase of OSI per degree of astigmatism at 180°,45°,90°,and 135°axis were 0.38(95%CI:0.35,0.42),0.50(95%CI:0.46,0.53),0.49(95%CI:0.45,0.54)and 0.37(95%CI:0.34,0.41),respectively.The decrease of MTFcut-off per degree of astigmatism at 180°,45°,90°,and 135°axis were-10.30(95%CI:-11.43,-9.16),-12.73(95%CI:-13.62,-11.86),-12.75(95%CI:-13.79,-11.70),and-9.97(95%CI:-10.92,-9.03),respectively.At the same astigmatism degree,OSI at 45°and 90°axis were higher than that at 0°and 135°axis,while MTFcut-off were lower.CONCLUSION:Low astigmatism of only 0.50 D can significantly reduce the objective visual quality.展开更多
Funding: Supported by the Beijing Institute of Technology Research Fund Program for Young Scholars (2020X04104).
Abstract: In this paper, an improved spatio-temporal alignment measurement method is presented to address the inertial matching measurement of hull deformation under the coexistence of time delay and large misalignment angle. Large misalignment angles and time delays often occur simultaneously and pose great challenges to the accurate measurement of hull deformation in space and time. The proposed method uses coarse alignment under a large misalignment angle and time delay estimation based on inertial measurement unit modeling to establish a new spatio-temporally aligned hull deformation measurement model. In addition, a two-step loop control is designed to ensure that the time-space alignment method accurately describes both the dynamic and the static deformation angle. The experiments illustrate that the proposed method can effectively measure the hull deformation angle when time delay and large misalignment angle coexist.
Funding: The National Natural Science Foundation of China (NNSFC) (Grant Nos. 72001213 and 72301292), the National Social Science Fund of China (Grant No. 19BGL297), and the Basic Research Program of Natural Science in Shaanxi Province (Grant No. 2021JQ-369).
Abstract: Due to the time-varying topology and possible disturbances in a conflict environment, it is still challenging to maintain the mission performance of flying ad hoc networks (FANETs), which limits the application of Unmanned Aerial Vehicle (UAV) swarms in harsh environments. This paper proposes an intelligent framework to quickly recover the cooperative coverage mission by aggregating the historical spatio-temporal network with the attention mechanism. The mission resilience metric is introduced in conjunction with connectivity and coverage status information to simplify the optimization model. A spatio-temporal node pooling method is proposed to ensure that all node location features can be updated after destruction by capturing the temporal network structure. Combined with the corresponding Laplacian matrix as the hyperparameter, a recovery algorithm based on the multi-head attention graph network is designed to achieve rapid recovery. Simulation results show that the proposed framework recovers connectivity and coverage more rapidly and effectively than existing studies. The average connectivity and coverage results are improved by 17.92% and 16.96%, respectively, compared with the state-of-the-art model. Furthermore, an ablation study compares the contribution of each improvement. The proposed model can be used to support resilient network design for real-time mission execution.
Abstract: Decline in wildlife populations is manifest globally, regionally and locally. A wildlife decline of 68% has been reported in Kenya’s rangelands, with Baringo County experiencing more than 85% wildlife loss in the last four decades. The Greater Kudu (Tragelaphus strepsiceros) is endemic to the Lake Bogoria landscape in Baringo County and constitutes a major tourist attraction for the region; its photo is used on the County’s logo, making it a flagship species. Tourism plays a central role in Baringo County’s economy and is a major source of potential growth and employment creation. The study was carried out to assess spatio-temporal change of dispersal areas of the Greater Kudu (GK) in the Lake Bogoria landscape in the last four years for enhanced adaptive management and improved livelihoods. GK population distribution primary data collected in December 2022 and secondary data acquired from Lake Bogoria National Game Reserve (LBNGR) for 2019 and 2020 were digitized in a Geographic Information System (GIS). Measures of dispersion and point pattern analysis (PPA) were used to analyze dispersal of the GK population using GIS. Spatio-temporal change of GK dispersal in LBNR was evident; thus the null hypothesis was rejected. It is recommended that anthropogenic activities contributing to GK’s habitat degradation be curbed by providing alternative livelihood sources and promoting community adoption of sustainable technologies for improved livelihoods.
Funding: This work was financially supported by Universiti Sains Malaysia (USM) under FRGS grant number FRGS/1/2020/TK03/USM/02/1. The authors also thank the School of Computer Sciences, USM, for its support.
Abstract: Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recognize actions. This study focuses on human activity in video sequences acquired with an RGB camera because of its vast range of real-world applications. It uses a two-stream ConvNet to extract spatial and temporal information and proposes a fine-tuned deep neural network. Moreover, the transfer learning paradigm is adopted to extract varied and fixed frames while reusing object identification information. Six state-of-the-art pre-trained models are exploited to find the best model for spatial feature extraction. For the temporal sequence, this study uses dense optical flow following the two-stream ConvNet and Bidirectional Long Short-Term Memory (BiLSTM) to capture long-term dependencies. Two state-of-the-art datasets, UCF101 and HMDB51, are used for evaluation purposes. In addition, seven state-of-the-art optimizers are used to fine-tune the proposed network parameters. Furthermore, this study utilizes an ensemble mechanism to aggregate spatial-temporal features using a four-stream Convolutional Neural Network (CNN), where two streams use RGB data while the others use optical flow images. Finally, the proposed ensemble approach using max hard voting outperforms state-of-the-art methods with 96.30% and 90.07% accuracies on the UCF101 and HMDB51 datasets.
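The hard-voting step named at the end of this abstract can be illustrated with a small sketch. It is a minimal example under stated assumptions, not the authors' implementation: the function name `hard_vote`, the integer class labels, and the tie rule (the first label encountered among the tied ones wins) are hypothetical choices made for illustration.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority (hard) voting over per-stream class predictions.

    predictions: list of label lists, one list per stream, all the same length.
    Returns one voted label per sample.
    """
    n_samples = len(predictions[0])
    if any(len(stream) != n_samples for stream in predictions):
        raise ValueError("every stream must predict the same number of samples")

    voted = []
    for i in range(n_samples):
        votes = [stream[i] for stream in predictions]
        # most_common(1) returns the label with the highest count; among tied
        # labels, the one seen first in `votes` is returned (assumed tie rule).
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Four hypothetical streams (e.g. two RGB, two optical flow), three video clips.
streams = [[0, 1, 2], [0, 1, 1], [1, 1, 2], [0, 2, 2]]
print(hard_vote(streams))
```

Hard voting uses only the predicted labels, so streams with differently calibrated confidence scores can be combined without rescaling.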
Funding: Supported by the China Geological Survey (DD20230554, DD20230089), the Strategic Priority Research Program of the Chinese Academy of Science (XDA28020302), and the funding project of the Northeast Geological S&T Innovation Center of China Geological Survey (QCJJ2022-40).
Abstract: To illuminate the spatio-temporal variation characteristics and geochemical driving mechanism of soil pH in the Nenjiang River Basin, the National Multi-objective Regional Geochemical Survey data of topsoil, the Second National Soil Survey data and the Normalized Difference Vegetation Index (NDVI) were analyzed. The areas of neutral and alkaline soil decreased by 21100 km² and 30500 km², respectively, while those of strongly alkaline, extremely alkaline, and strongly acidic soil increased by 19600 km², 18200 km², and 15500 km², respectively, during the past 30 years. NDVI decreased with increasing soil pH when soil pH > 8.0, and the relationship was reversed when soil pH < 5.0. Soil pH differed significantly among surface cover types, in the ascending order: arbor < reed < maize < rice < high- and medium-covered meadow < low-covered meadow < Puccinellia. The weathering products of minerals rich in K₂O, Na₂O, CaO, and MgO entered the low plain and were enriched in different parts by water transportation and lake deposition, while Fe and Al remained in the low hilly areas, which was the geochemical driving mechanism. The results of this study will provide a scientific basis for rational decisions on soil acidification and salinization.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61822602, Grant 61772207, Grant 61802331, Grant 61572418, Grant 61602399, Grant 61702439 and Grant 61773331; the China Postdoctoral Science Foundation under Grant 2019T120732 and Grant 2017M622691; the National Science Foundation (NSF) under Grant 1704287, Grant 1252292 and Grant 1741277; and the Natural Science Foundation of Shandong Province under Grant ZR2016FM42.
Abstract: With the development of the Internet of Things (IoT), spatio-temporal crowdsourcing (mobile crowdsourcing) has become an emerging paradigm for addressing location-based sensing tasks. However, the delay caused by network transmission has led to low data processing efficiency. Edge computing can solve this problem: it effectively reduces the delay of data transmission and improves data processing capacity, so that the crowdsourcing platform can make better decisions faster. Therefore, this paper combines spatio-temporal crowdsourcing and edge computing to study the Multi-Objective Optimization Task Assignment (MOO-TA) problem in the edge computing environment. The proposed online incentive mechanism considers the task difficulty attribute to motivate crowd workers to perform sensing tasks in unpopular areas. In this paper, the Weighted and Multi-Objective Particle Swarm Combination (WAMOPSC) algorithm is proposed to maximize both the platform’s and the crowd workers’ utility, so as to maximize social welfare. The algorithm combines the traditional Linear Weighted Summation (LWS) algorithm and the Multi-Objective Particle Swarm Optimization (MOPSO) algorithm to find as many Pareto optimal solutions of the multi-objective optimization task assignment problem as possible for the crowdsourcing platform to choose from. Through comparison experiments on real data sets, the effectiveness and feasibility of the proposed method are evaluated.
Abstract: Spatio-temporal analysis of drought provides valuable information for drought management and damage mitigation. In this study, the Standardized Precipitation Index at the time scale of 6 months (SPI-6) is selected to reflect drought conditions in the North-Eastern coastal region of Vietnam. The drought events and their characteristics from 1981 to 2019 are detected at 9 meteorological stations and 10 CHIRPS rainfall stations. The spatio-temporal variation of drought in the study region is analyzed on the basis of the number, duration, severity, intensity, and peak of the detected drought events at the 19 stations. The results show that from 1981 to 2019 the drought events mainly occurred with 1-season duration and moderate intensity and peak. The number, duration, severity, and peak of the drought events were the greatest in the period 2001-2010 and the smallest in the period 2011-2019. Among the 19 stations, the drought duration tends to decrease at 11 stations, increase at 7 stations, and vary only slightly at 1 station; the drought severity tends to decrease at 14 stations, increase at 4 stations, and show no significant trend at 1 station; the drought intensity tends to decrease at 17 stations, increase at 1 station, and vary only slightly at 1 station; and the drought peak tends to decrease at 18 stations and increase at 1 station.
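The event characteristics listed in this abstract (duration, severity, intensity, peak) can be extracted from an SPI series along the following lines. This is a sketch under assumptions, not the study's procedure: the abstract does not state the drought threshold or the exact definitions, so the threshold of -1.0, the function name `drought_events`, and the run-based definitions (severity as the sum of absolute SPI values, intensity as severity divided by duration, peak as the lowest SPI value) are conventions assumed here for illustration.

```python
def drought_events(spi, threshold=-1.0):
    """Split an SPI series into drought events (runs at or below `threshold`).

    Assumed definitions (not taken from the abstract):
      duration  = number of consecutive time steps in the run
      severity  = sum of |SPI| over the run
      intensity = severity / duration
      peak      = lowest SPI value in the run
    """
    events = []
    run = []
    # A +inf sentinel closes a run that extends to the end of the series.
    for value in list(spi) + [float("inf")]:
        if value <= threshold:
            run.append(value)
        elif run:
            severity = sum(abs(v) for v in run)
            events.append({
                "duration": len(run),
                "severity": severity,
                "intensity": severity / len(run),
                "peak": min(run),
            })
            run = []
    return events


# Hypothetical SPI-6 values for illustration only.
for event in drought_events([0.2, -1.2, -1.5, 0.1, -0.5, -2.0, 0.3]):
    print(event)
```

Per-station trends such as those reported in the abstract would then be computed on these per-event values; the trend test itself is not specified there and is not sketched here.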
Abstract: Spatio-temporal models are valuable tools for disease mapping and for understanding the geographical distribution of diseases and their temporal dynamics. Spatio-temporal models have been proven empirically to be very complex, and this complexity has led many to oversimplify and model the spatial and temporal dependencies independently. Unlike common practice, this study formulated a new spatio-temporal model in a Bayesian hierarchical framework that accounts for spatial and temporal dependencies jointly. The spatial and temporal dependencies were dynamically modelled via the Matérn exponential covariance function. The temporal aspect was captured by the parameters of the exponential with a first-order autoregressive structure. Inferences about the parameters were obtained via Markov Chain Monte Carlo (MCMC) techniques, and the spatio-temporal maps were obtained by mapping stable posterior means from the specific location and time from the best model that includes the significant risk factors. The model formulated was fitted to both simulation data and Kenya meningitis incidence data from 2013 to 2019 along with two covariates: Gross County Product (GCP) and average rainfall. The study found that both average rainfall and GCP had a significant positive association with meningitis occurrence. Also, regarding geographical distribution, the spatio-temporal maps showed that meningitis is not evenly distributed across the country, as some counties reported a high number of cases compared with other counties.
Funding: Supported by a grant from the Basic Science Research Program through the National Research Foundation (NRF) (2021R1F1A1063634), funded by the Ministry of Science and ICT (MSIT), Republic of Korea. This research is supported and funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2024R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding program Grant Code (NU/RG/SERC/12/6).
Abstract: Advances in machine vision systems have revolutionized applications such as autonomous driving, robotic navigation, and augmented reality. Despite substantial progress, challenges persist, including dynamic backgrounds, occlusion, and limited labeled data. To address these challenges, we introduce a comprehensive methodology to enhance image classification and object detection accuracy. The proposed approach integrates multiple methods in a complementary way. The process commences with the application of Gaussian filters to mitigate the impact of noise interference. The images are then segmented using Fuzzy C-Means segmentation in parallel with saliency mapping techniques to find the most prominent regions. Binary Robust Independent Elementary Features (BRIEF) characteristics are then extracted from data derived from the saliency maps and segmented images. For precise object separation, Oriented FAST and Rotated BRIEF (ORB) algorithms are employed. Genetic Algorithms (GAs) are used to optimize the Random Forest classifier parameters, which leads to improved performance. Our method stands out due to its comprehensive approach, addressing challenges such as changing backdrops, occlusion, and limited labeled data concurrently. A significant enhancement has been achieved by integrating GAs to precisely optimize parameters. This minor adjustment not only boosts the uniqueness of our system but also amplifies its overall efficacy. The proposed methodology has demonstrated notable classification accuracies of 90.9% and 89.0% on the challenging Corel-1k and MSRC datasets, respectively. Furthermore, detection accuracies of 87.2% and 86.6% have been attained. Although our method performed well on both datasets, it may face difficulties with real-world data, especially where datasets have highly complex backgrounds. Despite these limitations, GA integration for parameter optimization is a notable strength that enhances the overall adaptability and performance of our system.
Abstract: In crime science, understanding the dynamics and interactions between crime events is crucial for comprehending the underlying factors that drive their occurrences. Nonetheless, gaining access to detailed spatiotemporal crime records from law enforcement faces significant challenges due to confidentiality concerns. In response to these challenges, this paper introduces an innovative analytical tool named “stppSim,” designed to synthesize fine-grained spatiotemporal point records while safeguarding the privacy of individual locations. By utilizing the open-source R platform, this tool ensures easy accessibility for researchers, facilitating download, re-use, and potential advancements in various research domains beyond crime science.
Abstract: The data analysis of blasting sites has long been a goal of researchers in the field. The rise of mobile blasting robots has aroused many researchers’ interest in machine learning methods for target detection in the field of blasting. Serverless computing can provide a variety of computing services for people without a hardware foundation or rich software development experience, which has aroused interest in how to use it in the field of machine learning. In this paper, we design a distributed machine learning training application based on the AWS Lambda platform. Based on data parallelism, the data aggregation and training synchronization in Function as a Service (FaaS) are effectively realized. The application also encrypts the data set, effectively reducing the risk of data leakage. We rent a cloud server and a Lambda, and then conduct experiments to evaluate our application. Our results indicate the effectiveness, rapidity, and economy of distributed training on FaaS.
Funding: This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61906168, U20A20171), the Zhejiang Provincial Natural Science Foundation of China (Grant Nos. LY23F020023, LY21F020027), and the Construction of Hubei Provincial Key Laboratory for Intelligent Visual Monitoring of Hydropower Projects (Grant No. 2022SDSJ01).
Abstract: In clinical practice, the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications. Measuring the amount of each type of urine sediment allows for screening, diagnosis and evaluation of kidney and urinary tract disease, providing insight into the specific type and severity. However, manual urine sediment examination is labor-intensive, time-consuming, and subjective. Traditional machine learning based object detection methods require hand-crafted features for localization and classification, which generalize poorly and struggle to count urine sediments quickly and accurately. Deep learning based object detection methods have the potential to address these challenges, but they require access to large urine sediment image datasets. Unfortunately, only a limited number of urine sediment datasets are publicly available. To alleviate the lack of urine sediment datasets in medical image analysis, we propose a new dataset named UriSed2K, which contains 2465 high-quality images annotated with expert guidance. Two main challenges are associated with our dataset: a large number of small objects and the occlusion between these small objects. Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset. Specifically, our goal is to improve the accuracy and efficiency of the detection algorithm and, in doing so, provide medical professionals with an automatic detector that saves time and effort. We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO. The proposed algorithm comprises a local context attention module and a global background suppression module, which aid the detector in distinguishing urine sediment features in the image. The local context attention module captures context information beyond the object region, while the global background suppression module emphasizes objects in uninformative backgrounds. We comprehensively evaluate our method on the UriSed2K dataset, which includes seven categories of urine sediments: erythrocytes (red blood cells), leukocytes (white blood cells), epithelial cells, crystals, mycetes, broken erythrocytes, and broken leukocytes. The method achieves the best average precision (AP) of 95.3% while taking only 10 ms per image. The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.
Funding: This research was funded by the Natural Science Foundation of Hebei Province (F2021506004).
Abstract: Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. The query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
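For readers unfamiliar with non-maximum suppression, which the query filter above builds on, the classical greedy form is sketched here. This is the standard textbook procedure, not the training-aware variant proposed for H-DETR, whose details the abstract does not give. The function names, the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.

    A box is dropped when it overlaps an already kept, higher-scoring box
    by more than `iou_threshold`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical boxes: the first two overlap heavily, the third is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

In densely clustered scenes the threshold matters a great deal: a low value removes true neighbouring objects, while a high value leaves duplicates.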
Abstract: Recently, weak supervision has received growing attention in the field of salient object detection due to the convenience of labelling. However, there is a large performance gap between weakly supervised and fully supervised salient object detectors because the scribble annotation can only provide very limited foreground/background information. Therefore, an intuitive idea is to infer annotations that cover more complete object and background regions for training. To this end, a label inference strategy is proposed based on the assumption that pixels with similar colours and close positions should have consistent labels. Specifically, the k-means clustering algorithm is first applied to both the colours and the coordinates of the original annotations, and the same labels are then assigned to points whose colours are similar to the colour cluster centres and whose positions are near the coordinate cluster centres. Next, pixels with similar colours within each kernel neighbourhood are further assigned the same annotations. Extensive experiments on six benchmarks demonstrate that our method can significantly improve performance and achieve state-of-the-art results.
Funding: The Scientific Research Fund of Hunan Provincial Education Department (23A0423).
Abstract: Remote sensing imagery, due to its high altitude, presents inherent challenges characterized by multiple scales, limited target areas, and intricate backgrounds. These inherent traits often lead to increased missed and false detection rates when applying object recognition algorithms tailored for remote sensing imagery. Additionally, these complexities contribute to inaccuracies in target localization and hinder precise target categorization. This paper addresses these challenges by proposing the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting our method, we delve into the prevalent issues faced in remote sensing imagery analysis. Specifically, we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose a lightweight multi-scale module called CEF. This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information. It effectively addresses the missed detections and false alarms that are common in remote sensing imagery. Second, an additional layer of small target detection heads is added, and a residual link is established with the higher-level feature extraction module in the backbone section. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remotely sensed images. Finally, a dynamic head attention mechanism is introduced. This allows the model to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes. Consequently, the precision of object detection is significantly improved. The experimental results show that the YOLO-MFD model improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5 and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Funding: Supported in part by the Major Project for New Generation of AI (2018AAA0100400), the National Natural Science Foundation of China (61836014, U21B2042, 62072457, 62006231), and the InnoHK Program.
Abstract: Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then use LiDAR-based object detectors, or focus on image and depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and complex fusion modes with convolutions. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps to achieve more accurate depth information. Then we develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them using cross-attention to exchange information with each other. Furthermore, with the help of pixel-wise relative depth values in depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture a more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. The experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our proposed method and its superior performance over previous counterparts.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62006071 and in part by the Science and Technology Research Project of Henan Province under Grant 232103810086.
Abstract: In recent years, there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks. Despite these efforts, the detection of small objects in remote sensing remains a formidable challenge. Deep network structures cause a loss of object features, nearly eliminating some subtle features associated with small objects in deep layers. Additionally, the features of small objects are susceptible to interference from background features contained within the image, leading to a decline in detection accuracy. Moreover, the sensitivity of small objects to bounding box perturbation further increases the detection difficulty. In this paper, we introduce a novel approach, Cross-Layer Fusion and Weighted Receptive Field-based YOLO (CAW-YOLO), specifically designed for small object detection in remote sensing. To address feature loss in deep layers, we have devised a cross-layer attention fusion module. Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention (BRA). To enhance the model’s capacity to perceive multi-scale objects, particularly small-scale objects, we introduce a weighted multi-receptive field atrous spatial pyramid pooling module. Furthermore, we mitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance (NWD) and Efficient Intersection over Union (EIoU) losses. The efficacy of the proposed model in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets. The experimental results demonstrate the model’s pronounced advantages in small object detection for remote sensing, surpassing the performance of current mainstream models.
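The Normalized Wasserstein Distance mentioned in this abstract can be sketched as follows, using the formulation commonly given in the tiny-object detection literature: each box is modelled as a 2D Gaussian, the squared 2-Wasserstein distance reduces to a squared Euclidean distance between the vectors (cx, cy, w/2, h/2), and the result is mapped to (0, 1] with an exponential. The constant `c` is a dataset-dependent scale, and the value 12.8 used here is only a placeholder assumption. The abstract does not say how NWD and EIoU are weighted in the joint loss, so only the NWD term is shown.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two boxes.

    Boxes are (cx, cy, w, h). `c` is a dataset-dependent normalising
    constant; 12.8 is a placeholder, not a value from the abstract.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussian box models.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 as they move apart."""
    return 1.0 - nwd(box_a, box_b, c)


# Two 4x4 boxes whose centres are 5 pixels apart; they do not overlap much,
# yet NWD still changes smoothly with the offset.
print(nwd((10, 10, 4, 4), (13, 14, 4, 4)))
```

Unlike IoU, this measure stays non-zero and varies smoothly when two small boxes do not overlap at all, which is the property that makes it less sensitive to small bounding box perturbations.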
Funding: The Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2023GXJS163, ZDYF2024GXJS014), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024), the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012), the Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021), and the Youth Foundation Project of Hainan Natural Science Foundation (621QN211).
Abstract: Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in the field of small object detection on unmanned aerial vehicles (UAVs). This task is challenging due to variations in UAV flight altitude, differences in object scales, as well as factors like flight speed and motion blur. To enhance the detection efficacy of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted as P2, to improve adaptability for small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of intricate feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the utilization of multiscale feature information. On the VisDrone2023 and UAVDT datasets, MSC-YOLO outperforms the baseline method YOLOv7 by 3.0% in terms of mean average precision (mAP). The MSC-YOLO algorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets in UAV aerial photography, providing strong support for practical applications.
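The idea behind replacing strided downsampling with a space-to-depth rearrangement can be shown with a small sketch. It illustrates only the generic space-to-depth operation, in which every pixel of each block is moved into its own channel so that the spatial size halves without discarding values. It is not the CSPDC module itself, whose convolutional layers are not described in the abstract. The function name, the nested-list layout (channels, then rows, then columns), and the output channel order are assumptions chosen to keep the example free of dependencies.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange block x block spatial patches into channels.

    feature_map: list of C channels, each an H x W nested list.
    Returns C * block * block channels of size (H / block) x (W / block).
    No value is dropped, unlike strided convolution or pooling.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")

    out = []
    # One output channel per (row offset, column offset, input channel).
    for dy in range(block):
        for dx in range(block):
            for channel in feature_map:
                out.append([
                    [channel[y][x] for x in range(dx, width, block)]
                    for y in range(dy, height, block)
                ])
    return out


# One hypothetical 4x4 channel becomes four 2x2 channels.
single_channel = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
for channel in space_to_depth(single_channel):
    print(channel)
```

In a detector, a convolution would typically follow this rearrangement to mix the enlarged channel dimension; small objects that occupy only a few pixels keep all of their values in the process.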
Funding: Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (Grant No. J2022004).
Abstract: Insulator defect detection plays a vital role in maintaining the secure operation of power systems. Insulator defects are hard to detect and easily missed because of their small and variable scale and fuzzy edge morphology. To address these issues, we construct an insulator dataset with 1600 samples containing flashovers and breakages. Then a simple and effective surface defect detection method for power line insulators, aimed at difficult small objects, is proposed. Firstly, a high-resolution feature map is introduced and a small object prediction layer is added so that the model can detect tiny objects. Secondly, a simplified adaptive spatial feature fusion (SASFF) module is introduced to perform cross-scale spatial fusion and improve adaptability to variable multi-scale features. Finally, we propose an enhanced deformable attention mechanism (EDAM) module. By integrating a gating activation function, the model is further encouraged to learn a small number of critical sampling points near reference points, and the module improves the perception of object morphology. The experimental results indicate that, on the dataset of flashover and breakage defects, this method improves the performance of YOLOv5, YOLOv7, and YOLOv8. In practical application, it can simply and effectively improve the precision of power line insulator defect detection and reduce missed detections for difficult small objects.
Abstract: AIM: To evaluate the effect of low-degree astigmatism on objective visual quality through the Optical Quality Analysis System (OQAS). METHODS: This study enrolled 46 participants (aged 23 to 30 y, 90 eyes) with normal or corrected-to-normal vision. Cylindrical lenses (0, 0.5, 0.75, 1.0, and 1.25 D) were placed at four axes (180°, 45°, 90°, and 135°) in front of the eyes with the best correction to form 16 types of regular low-degree astigmatism. OQAS was used to measure the objective visual quality, recorded as the objective scattering index (OSI), OQAS values at contrasts of 100%, 20%, and 9% predictive visual acuity (OV100%, OV20%, and OV9%), modulation transfer function cut-off (MTFcut-off) and Strehl ratio (SR). A mixed-effect linear model was used to compare objective visual quality differences between groups and to examine associations between astigmatic magnitude and objective visual quality parameters. RESULTS: Apparent negative relationships between the magnitude of low astigmatism and objective visual quality were observed. The increases in OSI per degree of astigmatism at the 180°, 45°, 90°, and 135° axes were 0.38 (95%CI: 0.35, 0.42), 0.50 (95%CI: 0.46, 0.53), 0.49 (95%CI: 0.45, 0.54) and 0.37 (95%CI: 0.34, 0.41), respectively. The changes in MTFcut-off per degree of astigmatism at the 180°, 45°, 90°, and 135° axes were -10.30 (95%CI: -11.43, -9.16), -12.73 (95%CI: -13.62, -11.86), -12.75 (95%CI: -13.79, -11.70), and -9.97 (95%CI: -10.92, -9.03), respectively. At the same astigmatism degree, OSI at the 45° and 90° axes was higher than at the 0° and 135° axes, while MTFcut-off was lower. CONCLUSION: Low astigmatism of only 0.50 D can significantly reduce the objective visual quality.