期刊文献+
共找到2,910篇文章
< 1 2 146 >
每页显示 20 50 100
基于改进Deformable DETR的无人机视频流车辆目标检测算法
1
作者 江志鹏 王自全 +4 位作者 张永生 于英 程彬彬 赵龙海 张梦唯 《计算机工程与科学》 CSCD 北大核心 2024年第1期91-101,共11页
针对无人机视频流检测中小目标数量多、因图像传输质量较低而导致的上下文语义信息不充分、传统算法融合特征推理速度慢、数据集类别样本不均衡导致的训练效果差等问题,提出一种基于改进Deformable DETR的无人机视频流车辆目标检测算法... 针对无人机视频流检测中小目标数量多、因图像传输质量较低而导致的上下文语义信息不充分、传统算法融合特征推理速度慢、数据集类别样本不均衡导致的训练效果差等问题,提出一种基于改进Deformable DETR的无人机视频流车辆目标检测算法。在模型结构方面,该算法设计了跨尺度特征融合模块以增大感受野,提升小目标检测能力,并采用针对object_query的挤压-激励模块提升关键目标的响应值,减少重要目标的漏检与错检率;在数据处理方面,使用了在线困难样本挖掘技术,改善数据集中类别样本分布不均的问题。在UAVDT数据集上进行了实验,实验结果表明,改进后的算法相较于基线算法在平均检测精度上提升了1.5%,在小目标检测精度上提升了0.8%,并在保持参数量较少增长的情况下,维持了原有的检测速度。 展开更多
关键词 Deformable DETR 目标检测 跨尺度特征融合模块 object query挤压-激励 在线难样本挖掘
下载PDF
Automatic detection of small bowel lesions with different bleeding risks based on deep learning models 被引量:1
2
作者 Rui-Ya Zhang Peng-Peng Qiang +5 位作者 Ling-Jun Cai Tao Li Yan Qin Yu Zhang Yi-Qing Zhao Jun-Ping Wang 《World Journal of Gastroenterology》 SCIE CAS 2024年第2期170-183,共14页
BACKGROUND Deep learning provides an efficient automatic image recognition method for small bowel(SB)capsule endoscopy(CE)that can assist physicians in diagnosis.However,the existing deep learning models present some ... BACKGROUND Deep learning provides an efficient automatic image recognition method for small bowel(SB)capsule endoscopy(CE)that can assist physicians in diagnosis.However,the existing deep learning models present some unresolved challenges.AIM To propose a novel and effective classification and detection model to automatically identify various SB lesions and their bleeding risks,and label the lesions accurately so as to enhance the diagnostic efficiency of physicians and the ability to identify high-risk bleeding groups.METHODS The proposed model represents a two-stage method that combined image classification with object detection.First,we utilized the improved ResNet-50 classification model to classify endoscopic images into SB lesion images,normal SB mucosa images,and invalid images.Then,the improved YOLO-V5 detection model was utilized to detect the type of lesion and its risk of bleeding,and the location of the lesion was marked.We constructed training and testing sets and compared model-assisted reading with physician reading.RESULTS The accuracy of the model constructed in this study reached 98.96%,which was higher than the accuracy of other systems using only a single module.The sensitivity,specificity,and accuracy of the model-assisted reading detection of all images were 99.17%,99.92%,and 99.86%,which were significantly higher than those of the endoscopists’diagnoses.The image processing time of the model was 48 ms/image,and the image processing time of the physicians was 0.40±0.24 s/image(P<0.001).CONCLUSION The deep learning model of image classification combined with object detection exhibits a satisfactory diagnostic effect on a variety of SB lesions and their bleeding risks in CE images,which enhances the diagnostic efficiency of physicians and improves the ability of physicians to identify high-risk bleeding groups. 展开更多
关键词 Artificial intelligence Deep learning Capsule endoscopy Image classification Object detection Bleeding risk
下载PDF
Efficient Ship:A Hybrid Deep Learning Framework for Ship Detection in the River
3
作者 Huafeng Chen Junxing Xue +2 位作者 Hanyun Wen Yurong Hu Yudong Zhang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第1期301-320,共20页
Optical image-based ship detection can ensure the safety of ships and promote the orderly management of ships in offshore waters.Current deep learning researches on optical image-based ship detection mainly focus on i... Optical image-based ship detection can ensure the safety of ships and promote the orderly management of ships in offshore waters.Current deep learning researches on optical image-based ship detection mainly focus on improving one-stage detectors for real-time ship detection but sacrifices the accuracy of detection.To solve this problem,we present a hybrid ship detection framework which is named EfficientShip in this paper.The core parts of the EfficientShip are DLA-backboned object location(DBOL)and CascadeRCNN-guided object classification(CROC).The DBOL is responsible for finding potential ship objects,and the CROC is used to categorize the potential ship objects.We also design a pixel-spatial-level data augmentation(PSDA)to reduce the risk of detection model overfitting.We compare the proposed EfficientShip with state-of-the-art(SOTA)literature on a ship detection dataset called Seaships.Experiments show our ship detection framework achieves a result of 99.63%(mAP)at 45 fps,which is much better than 8 SOTA approaches on detection accuracy and can also meet the requirements of real-time application scenarios. 展开更多
关键词 Ship detection deep learning data augmentation object location object classification
下载PDF
Road Traffic Monitoring from Aerial Images Using Template Matching and Invariant Features
4
作者 Asifa Mehmood Qureshi Naif Al Mudawi +2 位作者 Mohammed Alonazi Samia Allaoua Chelloug Jeongmin Park 《Computers, Materials & Continua》 SCIE EI 2024年第3期3683-3701,共19页
Road traffic monitoring is an imperative topic widely discussed among researchers.Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides.However,aerial images provide the flexibilit... Road traffic monitoring is an imperative topic widely discussed among researchers.Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides.However,aerial images provide the flexibility to use mobile platforms to detect the location and motion of the vehicle over a larger area.To this end,different models have shown the ability to recognize and track vehicles.However,these methods are not mature enough to produce accurate results in complex road scenes.Therefore,this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts.The extracted frames were converted to grayscale,followed by the application of a georeferencing algorithm to embed coordinate information into the images.The masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system.Next,Sobel edge detection combined with Canny edge detection and Hough line transform has been applied for noise reduction.After preprocessing,the blob detection algorithm helped detect the vehicles.Vehicles of varying sizes have been detected by implementing a dynamic thresholding scheme.Detection was done on the first image of every burst.Then,to track vehicles,the model of each vehicle was made to find its matches in the succeeding images using the template matching algorithm.To further improve the tracking accuracy by incorporating motion information,Scale Invariant Feature Transform(SIFT)features have been used to find the best possible match among multiple matches.An accuracy rate of 87%for detection and 80%accuracy for tracking in the A1 Motorway Netherland dataset has been achieved.For the Vehicle Aerial Imaging from Drone(VAID)dataset,an accuracy rate of 86%for detection and 78%accuracy for tracking has been achieved. 展开更多
关键词 Unmanned Aerial Vehicles(UAV) aerial images DATASET object detection object tracking data elimination template matching blob detection SIFT VAID
下载PDF
Enhanced Object Detection and Classification via Multi-Method Fusion
5
作者 Muhammad Waqas Ahmed Nouf Abdullah Almujally +2 位作者 Abdulwahab Alazeb Asaad Algarni Jeongmin Park 《Computers, Materials & Continua》 SCIE EI 2024年第5期3315-3331,共17页
Advances in machine vision systems have revolutionized applications such as autonomous driving,robotic navigation,and augmented reality.Despite substantial progress,challenges persist,including dynamic backgrounds,occ... Advances in machine vision systems have revolutionized applications such as autonomous driving,robotic navigation,and augmented reality.Despite substantial progress,challenges persist,including dynamic backgrounds,occlusion,and limited labeled data.To address these challenges,we introduce a comprehensive methodology toenhance image classification and object detection accuracy.The proposed approach involves the integration ofmultiple methods in a complementary way.The process commences with the application of Gaussian filters tomitigate the impact of noise interference.These images are then processed for segmentation using Fuzzy C-Meanssegmentation in parallel with saliency mapping techniques to find the most prominent regions.The Binary RobustIndependent Elementary Features(BRIEF)characteristics are then extracted fromdata derived fromsaliency mapsand segmented images.For precise object separation,Oriented FAST and Rotated BRIEF(ORB)algorithms areemployed.Genetic Algorithms(GAs)are used to optimize Random Forest classifier parameters which lead toimproved performance.Our method stands out due to its comprehensive approach,adeptly addressing challengessuch as changing backdrops,occlusion,and limited labeled data concurrently.A significant enhancement hasbeen achieved by integrating Genetic Algorithms(GAs)to precisely optimize parameters.This minor adjustmentnot only boosts the uniqueness of our system but also amplifies its overall efficacy.The proposed methodologyhas demonstrated notable classification accuracies of 90.9%and 89.0%on the challenging Corel-1k and MSRCdatasets,respectively.Furthermore,detection accuracies of 87.2%and 86.6%have been attained.Although ourmethod performed well in both datasets it may face difficulties in real-world data especially where datasets havehighly complex backgrounds.Despite these limitations,GAintegration for parameter optimization shows a notablestrength in enhancing the overall adaptability and performance of our system. 展开更多
关键词 BRIEF features saliency map fuzzy c-means object detection object recognition
下载PDF
Two-Layer Attention Feature Pyramid Network for Small Object Detection
6
作者 Sheng Xiang Junhao Ma +2 位作者 Qunli Shang Xianbao Wang Defu Chen 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第10期713-731,共19页
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection.However,small objects are difficult to detect accurately because they contain les... Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection.However,small objects are difficult to detect accurately because they contain less information.Many current methods,particularly those based on Feature Pyramid Network(FPN),address this challenge by leveraging multi-scale feature fusion.However,existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers,leading to suboptimal small object detection.To address this problem,we propose the Two-layerAttention Feature Pyramid Network(TA-FPN),featuring two key modules:the Two-layer Attention Module(TAM)and the Small Object Detail Enhancement Module(SODEM).TAM uses the attention module to make the network more focused on the semantic information of the object and fuse it to the lower layer,so that each layer contains similar semantic information,to alleviate the problem of small object information being submerged due to semantic gaps between different layers.At the same time,SODEM is introduced to strengthen the local features of the object,suppress background noise,enhance the information details of the small object,and fuse the enhanced features to other feature layers to ensure that each layer is rich in small object information,to improve small object detection accuracy.Our extensive experiments on challenging datasets such as Microsoft Common Objects inContext(MSCOCO)and Pattern Analysis Statistical Modelling and Computational Learning,Visual Object Classes(PASCAL VOC)demonstrate the validity of the proposedmethod.Experimental results show a significant improvement in small object detection accuracy compared to state-of-theart detectors. 展开更多
关键词 Small object detection two-layer attention module small object detail enhancement module feature pyramid network
下载PDF
Construction Activity Analysis of Workers Based on Human Posture Estimation Information
7
作者 Xuhong Zhou Shuai Li +2 位作者 Jiepeng Liu Zhou Wu Yohchia Frank Chen 《Engineering》 SCIE EI CAS CSCD 2024年第2期225-236,共12页
Identifying workers’construction activities or behaviors can enable managers to better monitor labor efficiency and construction progress.However,current activity analysis methods for construction workers rely solely... Identifying workers’construction activities or behaviors can enable managers to better monitor labor efficiency and construction progress.However,current activity analysis methods for construction workers rely solely on manual observations and recordings,which consumes considerable time and has high labor costs.Researchers have focused on monitoring on-site construction activities of workers.However,when multiple workers are working together,current research cannot accu rately and automatically identify the construction activity.This research proposes a deep learning framework for the automated analysis of the construction activities of multiple workers.In this framework,multiple deep neural network models are designed and used to complete worker key point extraction,worker tracking,and worker construction activity analysis.The designed framework was tested at an actual construction site,and activity recognition for multiple workers was performed,indicating the feasibility of the framework for the automated monitoring of work efficiency. 展开更多
关键词 Pose estimation Activity analysis Object tracking Construction workers Automatic systems
下载PDF
Exploring Deep Learning Methods for Computer Vision Applications across Multiple Sectors:Challenges and Future Trends
8
作者 Narayanan Ganesh Rajendran Shankar +3 位作者 Miroslav Mahdal Janakiraman SenthilMurugan Jasgurpreet Singh Chohan Kanak Kalita 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第4期103-141,共39页
Computer vision(CV)was developed for computers and other systems to act or make recommendations based on visual inputs,such as digital photos,movies,and other media.Deep learning(DL)methods are more successful than ot... Computer vision(CV)was developed for computers and other systems to act or make recommendations based on visual inputs,such as digital photos,movies,and other media.Deep learning(DL)methods are more successful than other traditional machine learning(ML)methods inCV.DL techniques can produce state-of-the-art results for difficult CV problems like picture categorization,object detection,and face recognition.In this review,a structured discussion on the history,methods,and applications of DL methods to CV problems is presented.The sector-wise presentation of applications in this papermay be particularly useful for researchers in niche fields who have limited or introductory knowledge of DL methods and CV.This review will provide readers with context and examples of how these techniques can be applied to specific areas.A curated list of popular datasets and a brief description of them are also included for the benefit of readers. 展开更多
关键词 Neural network machine vision classification object detection deep learning
下载PDF
YOLO-MFD:Remote Sensing Image Object Detection with Multi-Scale Fusion Dynamic Head
9
作者 Zhongyuan Zhang Wenqiu Zhu 《Computers, Materials & Continua》 SCIE EI 2024年第5期2547-2563,共17页
Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false... Remote sensing imagery,due to its high altitude,presents inherent challenges characterized by multiple scales,limited target areas,and intricate backgrounds.These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery.Additionally,these complexities contribute to inaccuracies in target localization and hinder precise target categorization.This paper addresses these challenges by proposing a solution:The YOLO-MFD model(YOLO-MFD:Remote Sensing Image Object Detection withMulti-scale Fusion Dynamic Head).Before presenting our method,we delve into the prevalent issues faced in remote sensing imagery analysis.Specifically,we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds.To resolve these issues,we introduce a novel approach.First,we propose the implementation of a lightweight multi-scale module called CEF.This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information.It effectively addresses the issues of missed detection and mistaken alarms that are common in remote sensing imagery.Second,an additional layer of small target detection heads is added,and a residual link is established with the higher-level feature extraction module in the backbone section.This allows the model to incorporate shallower information,significantly improving the accuracy of target localization in remotely sensed images.Finally,a dynamic head attentionmechanism is introduced.This allows themodel to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes.Consequently,the precision of object detection is significantly improved.The trial results show that the YOLO-MFD model shows improvements of 6.3%,3.5%,and 2.5%over the original YOLOv8 model in Precision,map@0.5 and map@0.5:0.95,separately.These results illustrate the clear advantages of the method. 展开更多
关键词 Object detection YOLOv8 MULTI-SCALE attention mechanism dynamic detection head
下载PDF
Learning Discriminatory Information for Object Detection on Urine Sediment Image
10
作者 Sixian Chan Binghui Wu +2 位作者 Guodao Zhang Yuan Yao Hongqiang Wang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第1期411-428,共18页
In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,... In clinical practice,the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications.Measuring the amount of each type of urine sediment allows for screening,diagnosis and evaluation of kidney and urinary tract disease,providing insight into the specific type and severity.However,manual urine sediment examination is labor-intensive,time-consuming,and subjective.Traditional machine learning based object detection methods require hand-crafted features for localization and classification,which have poor generalization capabilities and are difficult to quickly and accurately detect the number of urine sediments.Deep learning based object detection methods have the potential to address the challenges mentioned above,but these methods require access to large urine sediment image datasets.Unfortunately,only a limited number of publicly available urine sediment datasets are currently available.To alleviate the lack of urine sediment datasets in medical image analysis,we propose a new dataset named UriSed2K,which contains 2465 high-quality images annotated with expert guidance.Two main challenges are associated with our dataset:a large number of small objects and the occlusion between these small objects.Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset.Specifically,our goal is to improve the accuracy and efficiency of the detection algorithm and,in doing so,provide medical professionals with an automatic detector that saves time and effort.We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO.The proposed algorithm comprises a local context attention module and a global background suppression module,which aid the detector in distinguishing urine sediment features in the image.The local context attention module captures context information beyond the object region,while the global background suppression module emphasizes objects in uninformative backgrounds.We comprehensively evaluate our method on the UriSed2K dataset,which includes seven categories of urine sediments,such as erythrocytes(red blood cells),leukocytes(white blood cells),epithelial cells,crystals,mycetes,broken erythrocytes,and broken leukocytes,achieving the best average precision(AP)of 95.3%while taking only 10 ms per image.The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5. 展开更多
关键词 Object detection attention mechanism medical image urine sediment
下载PDF
Intelligent Recognition Using Ultralight Multifunctional Nano‑Layered Carbon Aerogel Sensors with Human‑Like Tactile Perception
11
作者 Huiqi Zhao Yizheng Zhang +8 位作者 Lei Han Weiqi Qian Jiabin Wang Heting Wu Jingchen Li Yuan Dai Zhengyou Zhang Chris RBowen Ya Yang 《Nano-Micro Letters》 SCIE EI CAS CSCD 2024年第1期172-186,共15页
Humans can perceive our complex world through multi-sensory fusion.Under limited visual conditions,people can sense a variety of tactile signals to identify objects accurately and rapidly.However,replicating this uniq... Humans can perceive our complex world through multi-sensory fusion.Under limited visual conditions,people can sense a variety of tactile signals to identify objects accurately and rapidly.However,replicating this unique capability in robots remains a significant challenge.Here,we present a new form of ultralight multifunctional tactile nano-layered carbon aerogel sensor that provides pressure,temperature,material recognition and 3D location capabilities,which is combined with multimodal supervised learning algorithms for object recognition.The sensor exhibits human-like pressure(0.04–100 kPa)and temperature(21.5–66.2℃)detection,millisecond response times(11 ms),a pressure sensitivity of 92.22 kPa^(−1)and triboelectric durability of over 6000 cycles.The devised algorithm has universality and can accommodate a range of application scenarios.The tactile system can identify common foods in a kitchen scene with 94.63%accuracy and explore the topographic and geomorphic features of a Mars scene with 100%accuracy.This sensing approach empowers robots with versatile tactile perception to advance future society toward heightened sensing,recognition and intelligence. 展开更多
关键词 Multifunctional sensor Tactile perception Multimodal machine learning algorithms Universal tactile system Intelligent object recognition
下载PDF
General Optimal Trajectory Planning:Enabling Autonomous Vehicles with the Principle of Least Action
12
作者 Heye Huang Yicong Liu +4 位作者 Jinxin Liu Qisong Yang Jianqiang Wang David Abbink Arkady Zgonnikov 《Engineering》 SCIE EI CAS CSCD 2024年第2期63-76,共14页
This study presents a general optimal trajectory planning(GOTP)framework for autonomous vehicles(AVs)that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently.Firstly,we emplo... This study presents a general optimal trajectory planning(GOTP)framework for autonomous vehicles(AVs)that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently.Firstly,we employ the fifth-order Bezier curve to generate and smooth the reference path along the road centerline.Cartesian coordinates are then transformed to achieve the curvature continuity of the generated curve.Considering the road constraints and vehicle dynamics,limited polynomial candidate trajectories are generated and smoothed in a curvilinear coordinate system.Furthermore,in selecting the optimal trajectory,we develop a unified and auto-tune objective function based on the principle of least action by employing AVs to simulate drivers’behavior and summarizing their manipulation characteristics of“seeking benefits and avoiding losses.”Finally,by integrating the idea of receding-horizon optimization,the proposed framework is achieved by considering dynamic multi-performance objectives and selecting trajectories that satisfy feasibility,optimality,and adaptability.Extensive simulations and experiments are performed,and the results demonstrate the framework’s feasibility and effectiveness,which avoids both dynamic and static obstacles and applies to various scenarios with multi-source interactive traffic participants.Moreover,we prove that the proposed method can guarantee real-time planning and safety requirements compared to drivers’manipulation. 展开更多
关键词 Autonomous vehicle Trajectory planning Multi-performance objectives Principle of least action
下载PDF
Coal/Gangue Volume Estimation with Convolutional Neural Network and Separation Based on Predicted Volume and Weight
13
作者 Zenglun Guan Murad S.Alfarzaeai +2 位作者 Eryi Hu Taqiaden Alshmeri Wang Peng 《Computers, Materials & Continua》 SCIE EI 2024年第4期279-306,共28页
In the coal mining industry,the gangue separation phase imposes a key challenge due to the high visual similaritybetween coal and gangue.Recently,separation methods have become more intelligent and efficient,using new... In the coal mining industry,the gangue separation phase imposes a key challenge due to the high visual similaritybetween coal and gangue.Recently,separation methods have become more intelligent and efficient,using newtechnologies and applying different features for recognition.One such method exploits the difference in substancedensity,leading to excellent coal/gangue recognition.Therefore,this study uses density differences to distinguishcoal from gangue by performing volume prediction on the samples.Our training samples maintain a record of3-side images as input,volume,and weight as the ground truth for the classification.The prediction process relieson a Convolutional neural network(CGVP-CNN)model that receives an input of a 3-side image and then extractsthe needed features to estimate an approximation for the volume.The classification was comparatively performedvia ten different classifiers,namely,K-Nearest Neighbors(KNN),Linear Support Vector Machines(Linear SVM),Radial Basis Function(RBF)SVM,Gaussian Process,Decision Tree,Random Forest,Multi-Layer Perceptron(MLP),Adaptive Boosting(AdaBosst),Naive Bayes,and Quadratic Discriminant Analysis(QDA).After severalexperiments on testing and training data,results yield a classification accuracy of 100%,92%,95%,96%,100%,100%,100%,96%,81%,and 92%,respectively.The test reveals the best timing with KNN,which maintained anaccuracy level of 100%.Assessing themodel generalization capability to newdata is essential to ensure the efficiencyof the model,so by applying a cross-validation experiment,the model generalization was measured.The useddataset was isolated based on the volume values to ensure the model generalization not only on new images of thesame volume but with a volume outside the trained range.Then,the predicted volume values were passed to theclassifiers group,where classification reported accuracy was found to be(100%,100%,100%,98%,88%,87%,100%,87%,97%,100%),respectively.Although obtaining a classification with high accuracy is the main motive,this workhas a remarkable reduction in the data preprocessing time compared to related works.The CGVP-CNN modelmanaged to reduce the data preprocessing time of previous works to 0.017 s while maintaining high classificationaccuracy using the estimated volume value. 展开更多
关键词 COAL coal gangue convolutional neural network CNN object classification volume estimation separation system
下载PDF
SAM Era:Can It Segment Any Industrial Surface Defects?
14
作者 Kechen Song Wenqi Cui +2 位作者 Han Yu Xingjie Li Yunhui Yan 《Computers, Materials & Continua》 SCIE EI 2024年第3期3953-3969,共17页
Segment Anything Model(SAM)is a cutting-edge model that has shown impressive performance in general object segmentation.The birth of the segment anything is a groundbreaking step towards creating a universal intellige... Segment Anything Model(SAM)is a cutting-edge model that has shown impressive performance in general object segmentation.The birth of the segment anything is a groundbreaking step towards creating a universal intelligent model.Due to its superior performance in general object segmentation,it quickly gained attention and interest.This makes SAM particularly attractive in industrial surface defect segmentation,especially for complex industrial scenes with limited training data.However,its segmentation ability for specific industrial scenes remains unknown.Therefore,in this work,we select three representative and complex industrial surface defect detection scenarios,namely strip steel surface defects,tile surface defects,and rail surface defects,to evaluate the segmentation performance of SAM.Our results show that although SAM has great potential in general object segmentation,it cannot achieve satisfactory performance in complex industrial scenes.Our test results are available at:https://github.com/VDT-2048/SAM-IS. 展开更多
关键词 Segment anything SAM surface defect detection salient object detection
下载PDF
A Secure and Cost-Effective Training Framework Atop Serverless Computing for Object Detection in Blasting
15
作者 Tianming Zhang Zebin Chen +4 位作者 Haonan Guo Bojun Ren Quanmin Xie Mengke Tian Yong Wang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第5期2139-2154,共16页
The data analysis of blasting sites has always been the research goal of relevant researchers.The rise of mobile blasting robots has aroused many researchers’interest in machine learning methods for target detection ... The data analysis of blasting sites has always been the research goal of relevant researchers.The rise of mobile blasting robots has aroused many researchers’interest in machine learning methods for target detection in the field of blasting.Serverless Computing can provide a variety of computing services for people without hardware foundations and rich software development experience,which has aroused people’s interest in how to use it in the field ofmachine learning.In this paper,we design a distributedmachine learning training application based on the AWS Lambda platform.Based on data parallelism,the data aggregation and training synchronization in Function as a Service(FaaS)are effectively realized.It also encrypts the data set,effectively reducing the risk of data leakage.We rent a cloud server and a Lambda,and then we conduct experiments to evaluate our applications.Our results indicate the effectiveness,rapidity,and economy of distributed training on FaaS. 展开更多
关键词 Serverless computing object detection BLASTING
下载PDF
Optimal Positioning Strategy for Multi-Camera Zooming Drones
16
作者 Manuel Vargas Carlos Vivas Teodoro Alamo 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第8期1802-1818,共17页
In the context of multiple-target tracking and surveillance applications,this paper investigates the challenge of determining the optimal positioning of a single autonomous aerial vehicle or agent equipped with multip... In the context of multiple-target tracking and surveillance applications,this paper investigates the challenge of determining the optimal positioning of a single autonomous aerial vehicle or agent equipped with multiple independently-steerable zooming cameras to effectively monitor a set of targets of interest.Each camera is dedicated to tracking a specific target or cluster of targets.The key innovation of this study,in comparison to existing approaches,lies in incorporating the zooming factor for the onboard cameras into the optimization problem.This enhancement offers greater flexibility during mission execution by allowing the autonomous agent to adjust the focal lengths of the onboard cameras,in exchange for varying real-world distances to the corresponding targets,thereby providing additional degrees of freedom to the optimization problem.The proposed optimization framework aims to strike a balance among various factors,including distance to the targets,verticality of viewpoints,and the required focal length for each camera.The primary focus of this paper is to establish the theoretical groundwork for addressing the non-convex nature of the optimization problem arising from these considerations.To this end,we develop an original convex approximation strategy.The paper also includes simulations of diverse scenarios,featuring varying numbers of onboard tracking cameras and target motion profiles,to validate the effectiveness of the proposed approach. 展开更多
关键词 Convex optimization projective transformation unmanned aerial vehicle visual object tracking visual surveillance.
下载PDF
An object detection approach with residual feature fusion and second-order term attention mechanism
17
作者 Cuijin Li Zhong Qu Shengye Wang 《CAAI Transactions on Intelligence Technology》 SCIE EI 2024年第2期411-424,共14页
Automatically detecting and locating remote occlusion small objects from the images of complex traffic environments is a valuable and challenging research.Since the boundary box location is not sufficiently accurate a... Automatically detecting and locating remote occlusion small objects from the images of complex traffic environments is a valuable and challenging research.Since the boundary box location is not sufficiently accurate and it is difficult to distinguish overlapping and occluded objects,the authors propose a network model with a second-order term attention mechanism and occlusion loss.First,the backbone network is built on CSPDarkNet53.Then a method is designed for the feature extraction network based on an item-wise attention mechanism,which uses the filtered weighted feature vector to replace the original residual fusion and adds a second-order term to reduce the information loss in the process of fusion and accelerate the convergence of the model.Finally,an objected occlusion regression loss function is studied to reduce the problems of missed detections caused by dense objects.Sufficient experimental results demonstrate that the authors’method achieved state-of-the-art performance without reducing the detection speed.The mAP@.5 of the method is 85.8%on the Foggy_cityscapes dataset and the mAP@.5 of the method is 97.8%on the KITTI dataset. 展开更多
关键词 artificial intelligence computer vision image processing machine learning neural network object recognition
下载PDF
Probability-Enhanced Anchor-Free Detector for Remote-Sensing Object Detection
18
作者 Chengcheng Fan Zhiruo Fang 《Computers, Materials & Continua》 SCIE EI 2024年第6期4925-4943,共19页
Anchor-free object-detection methods achieve a significant advancement in field of computer vision,particularly in the realm of real-time inferences.However,in remote sensing object detection,anchor-free methods often... Anchor-free object-detection methods achieve a significant advancement in field of computer vision,particularly in the realm of real-time inferences.However,in remote sensing object detection,anchor-free methods often lack of capability in separating the foreground and background.This paper proposes an anchor-free method named probability-enhanced anchor-free detector(ProEnDet)for remote sensing object detection.First,a weighted bidirectional feature pyramid is used for feature extraction.Second,we introduce probability enhancement to strengthen the classification of the object’s foreground and background.The detector uses the logarithm likelihood as the final score to improve the classification of the foreground and background of the object.ProEnDet is verified using the DIOR and NWPU-VHR-10 datasets.The experiment achieved mean average precisions of 61.4 and 69.0 on the DIOR dataset and NWPU-VHR-10 dataset,respectively.ProEnDet achieves a speed of 32.4 FPS on the DIOR dataset,which satisfies the real-time requirements for remote-sensing object detection. 展开更多
关键词 Object detection anchor-free detector PROBABILISTIC fully convolutional neural network remote sensing
下载PDF
YOLOv5ST:A Lightweight and Fast Scene Text Detector
19
作者 Yiwei Liu Yingnan Zhao +2 位作者 Yi Chen Zheng Hu Min Xia 《Computers, Materials & Continua》 SCIE EI 2024年第4期909-926,共18页
Scene text detection is an important task in computer vision.In this paper,we present YOLOv5 Scene Text(YOLOv5ST),an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection.Our primary goal ... Scene text detection is an important task in computer vision.In this paper,we present YOLOv5 Scene Text(YOLOv5ST),an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection.Our primary goal is to enhance inference speed without sacrificing significant detection accuracy,thereby enabling robust performance on resource-constrained devices like drones,closed-circuit television cameras,and other embedded systems.To achieve this,we propose key modifications to the network architecture to lighten the original backbone and improve feature aggregation,including replacing standard convolution with depth-wise convolution,adopting the C2 sequence module in place of C3,employing Spatial Pyramid Pooling Global(SPPG)instead of Spatial Pyramid Pooling Fast(SPPF)and integrating Bi-directional Feature Pyramid Network(BiFPN)into the neck.Experimental results demonstrate a remarkable 26%improvement in inference speed compared to the baseline,with only marginal reductions of 1.6%and 4.2%in mean average precision(mAP)at the intersection over union(IoU)thresholds of 0.5 and 0.5:0.95,respectively.Our work represents a significant advancement in scene text detection,striking a balance between speed and accuracy,making it well-suited for performance-constrained environments. 展开更多
关键词 Scene text detection YOLOv5 LIGHTWEIGHT object detection
下载PDF
CAW-YOLO:Cross-Layer Fusion and Weighted Receptive Field-Based YOLO for Small Object Detection in Remote Sensing
20
作者 Weiya Shi Shaowen Zhang Shiqiang Zhang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第6期3209-3231,共23页
In recent years,there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks.Despite these efforts,the detection of small objects in re... In recent years,there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks.Despite these efforts,the detection of small objects in remote sensing remains a formidable challenge.The deep network structure will bring about the loss of object features,resulting in the loss of object features and the near elimination of some subtle features associated with small objects in deep layers.Additionally,the features of small objects are susceptible to interference from background features contained within the image,leading to a decline in detection accuracy.Moreover,the sensitivity of small objects to the bounding box perturbation further increases the detection difficulty.In this paper,we introduce a novel approach,Cross-Layer Fusion and Weighted Receptive Field-based YOLO(CAW-YOLO),specifically designed for small object detection in remote sensing.To address feature loss in deep layers,we have devised a cross-layer attention fusion module.Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention(BRA).To enhance the model’s capacity to perceive multi-scale objects,particularly small-scale objects,we introduce a weightedmulti-receptive field atrous spatial pyramid poolingmodule.Furthermore,wemitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance(NWD)and Efficient Intersection over Union(EIoU)losses.The efficacy of the proposedmodel in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets.The experimental results unequivocally demonstrate the model’s pronounced advantages in small object detection for remote sensing,surpassing the performance of current mainstream models. 展开更多
关键词 Small object detection attention mechanism cross-layer fusion discrete cosine transform
下载PDF
上一页 1 2 146 下一页 到第
使用帮助 返回顶部