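Vehicle Object Detection Algorithm for UAV Video Streams Based on Improved Deformable DETR
1
Authors: 江志鹏, 王自全, 张永生, 于英, 程彬彬, 赵龙海, 张梦唯 《计算机工程与科学》 CSCD PKU Core 2024, No. 1, pp. 91-101 (11 pages)
UAV video stream detection suffers from a large number of small targets, insufficient contextual semantic information caused by low image transmission quality, slow inference in traditional feature-fusion algorithms, and poor training caused by class imbalance in datasets. To address these problems, a vehicle object detection algorithm for UAV video streams based on an improved Deformable DETR is proposed. In terms of model structure, the algorithm designs a cross-scale feature fusion module to enlarge the receptive field and improve small-target detection, and applies a squeeze-and-excitation module to object_query to raise the response of key targets and reduce the missed and false detection rates for important targets. In terms of data processing, online hard example mining is used to mitigate the uneven distribution of class samples in the dataset. Experiments on the UAVDT dataset show that, compared with the baseline, the improved algorithm raises average detection precision by 1.5% and small-target detection precision by 0.8%, while maintaining the original detection speed with only a small increase in parameter count.
Keywords: Deformable DETR; object detection; cross-scale feature fusion module; object query squeeze-and-excitation; online hard example mining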
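The first entry mentions online hard example mining as its answer to class imbalance. As a rough illustration of the general idea only (not the authors' implementation), the sketch below keeps the samples with the largest per-sample loss and averages over them. The function names and the `keep_ratio` parameter are hypothetical.

```python
from typing import List


def ohem_select(losses: List[float], keep_ratio: float = 0.25) -> List[int]:
    """Return indices of the hardest samples (largest loss), hardest first."""
    if not losses:
        raise ValueError("losses must not be empty")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    # Always keep at least one sample, even for tiny batches.
    keep = max(1, int(len(losses) * keep_ratio))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:keep]


def ohem_loss(losses: List[float], keep_ratio: float = 0.25) -> float:
    """Mean loss over the selected hard samples only (assumed reduction)."""
    hard = ohem_select(losses, keep_ratio)
    return sum(losses[i] for i in hard) / len(hard)
```

Easy samples then contribute nothing to the update, which is one common way to stop frequent, easy classes from dominating training.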
Road Traffic Monitoring from Aerial Images Using Template Matching and Invariant Features (Cited by: 1)
2
Authors: Asifa Mehmood Qureshi, Naif Al Mudawi, Mohammed Alonazi, Samia Allaoua Chelloug, Jeongmin Park 《Computers, Materials & Continua》 SCIE EI 2024, No. 3, pp. 3683-3701 (19 pages)
Road traffic monitoring is an imperative topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. The masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and Hough line transform was applied for noise reduction. After preprocessing, the blob detection algorithm helped detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was done on the first image of every burst. Then, to track vehicles, the model of each vehicle was made to find its matches in the succeeding images using the template matching algorithm. To further improve the tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best possible match among multiple matches. An accuracy rate of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherland dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy rate of 86% for detection and 78% for tracking was achieved.
Keywords: Unmanned Aerial Vehicles (UAV); aerial images; dataset; object detection; object tracking; data elimination; template matching; blob detection; SIFT; VAID
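The second entry tracks each vehicle by template matching across the frames of a burst. The abstract does not say which similarity score is used, so the sketch below assumes the sum of squared differences (SSD) on small grayscale grids. `match_template` and the `Image` alias are illustrative names, not part of the paper.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def match_template(image: Image, template: Image) -> Tuple[int, int]:
    """Return (row, col) of the window with the smallest SSD to the template."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = (0, 0), float("inf")
    # Slide the template over every position where it fits entirely.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos
```

When several windows score similarly, a second cue is needed to pick among them, which is the role the abstract assigns to SIFT features.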
Automatic detection of small bowel lesions with different bleeding risks based on deep learning models (Cited by: 1)
3
Authors: Rui-Ya Zhang, Peng-Peng Qiang, Ling-Jun Cai, Tao Li, Yan Qin, Yu Zhang, Yi-Qing Zhao, Jun-Ping Wang 《World Journal of Gastroenterology》 SCIE CAS 2024, No. 2, pp. 170-183 (14 pages)
BACKGROUND: Deep learning provides an efficient automatic image recognition method for small bowel (SB) capsule endoscopy (CE) that can assist physicians in diagnosis. However, the existing deep learning models present some unresolved challenges. AIM: To propose a novel and effective classification and detection model to automatically identify various SB lesions and their bleeding risks, and label the lesions accurately so as to enhance the diagnostic efficiency of physicians and the ability to identify high-risk bleeding groups. METHODS: The proposed model is a two-stage method that combines image classification with object detection. First, we utilized the improved ResNet-50 classification model to classify endoscopic images into SB lesion images, normal SB mucosa images, and invalid images. Then, the improved YOLO-V5 detection model was utilized to detect the type of lesion and its risk of bleeding, and the location of the lesion was marked. We constructed training and testing sets and compared model-assisted reading with physician reading. RESULTS: The accuracy of the model constructed in this study reached 98.96%, which was higher than the accuracy of other systems using only a single module. The sensitivity, specificity, and accuracy of the model-assisted reading detection of all images were 99.17%, 99.92%, and 99.86%, which were significantly higher than those of the endoscopists’ diagnoses. The image processing time of the model was 48 ms/image, and the image processing time of the physicians was 0.40±0.24 s/image (P<0.001). CONCLUSION: The deep learning model of image classification combined with object detection exhibits a satisfactory diagnostic effect on a variety of SB lesions and their bleeding risks in CE images, which enhances the diagnostic efficiency of physicians and improves the ability of physicians to identify high-risk bleeding groups.
Keywords: Artificial intelligence; Deep learning; Capsule endoscopy; Image classification; Object detection; Bleeding risk
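The third entry reports sensitivity, specificity, and accuracy. For readers who want the definitions at hand, here is a minimal sketch using the standard confusion-matrix formulas. The function name is hypothetical, and the abstract does not give the underlying counts, so any numbers fed to it are illustrative only.

```python
from typing import Dict


def reading_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        # Share of true lesion images that were flagged.
        "sensitivity": tp / (tp + fn),
        # Share of non-lesion images that were correctly passed.
        "specificity": tn / (tn + fp),
        # Share of all images classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```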
An Underwater Target Detection Algorithm Based on Attention Mechanism and Improved YOLOv7 (Cited by: 1)
4
Authors: Liqiu Ren, Zhanying Li, Xueyu He, Lingyan Kong, Yinghao Zhang 《Computers, Materials & Continua》 SCIE EI 2024, No. 2, pp. 2829-2845 (17 pages)
When underwater robots perform target detection tasks, the color distortion and uneven quality of underwater images make feature extraction difficult for the model, which leads to false detections, missed detections, and poor accuracy. Therefore, this paper proposes the CER-YOLOv7 (CBAM-EIOU-RepVGG-YOLOv7) underwater target detection algorithm. To improve the algorithm’s capability to retain valid features from both spatial and channel perspectives during the feature extraction phase, we have added a Convolutional Block Attention Module (CBAM) to the backbone network. The Reparameterization Visual Geometry Group (RepVGG) module is inserted into the backbone to improve the training and inference capabilities. The Efficient Intersection over Union (EIoU) loss is also used as the localization loss function, which reduces the false detection rate and missed detection rate of the algorithm. The experimental results of the CER-YOLOv7 algorithm on the UPRC (Underwater Robot Prototype Competition) dataset show that the mAP (mean Average Precision) score of the algorithm is 86.1%, which is a 2.2% improvement compared to YOLOv7. The feasibility and validity of CER-YOLOv7 are demonstrated through ablation and comparison experiments, and it is more suitable for underwater target detection.
Keywords: Deep learning; underwater object detection; improved YOLOv7; attention mechanism
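The fourth entry uses the EIoU loss for localization. The sketch below follows the commonly cited form of that loss (one minus IoU, plus normalized center-distance, width, and height penalties), assuming boxes are given as `(x1, y1, x2, y2)`. It is written from the general definition rather than from this paper's code, and `eiou_loss` and `eps` are hypothetical names.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def eiou_loss(pred: Box, target: Box, eps: float = 1e-9) -> float:
    """EIoU loss: 1 - IoU + center-distance, width, and height penalties."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, normalized by the enclosing-box diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing-box sides.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

Unlike plain IoU loss, the extra terms still give a useful signal when the boxes do not overlap at all.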
Enhancing Dense Small Object Detection in UAV Images Based on Hybrid Transformer (Cited by: 1)
5
Authors: Changfeng Feng, Chunping Wang, Dongdong Zhang, Renke Kou, Qiang Fu 《Computers, Materials & Continua》 SCIE EI 2024, No. 3, pp. 3993-4013 (21 pages)
Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
Keywords: UAV images; transformer; dense small object detection
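The fifth entry's query filter builds on non-maximum suppression. The abstract does not describe the training-aware variant, so the sketch below shows only plain greedy NMS as background. The function names and the `iou_threshold` default are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # A box survives only if it does not overlap any already-kept box too much.
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In dense scenes the threshold matters a great deal: set it too low and genuinely adjacent small objects suppress each other.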
Human intrusion detection for high-speed railway perimeter under all-weather condition (Cited by: 1)
6
Authors: Pengyue Guo, Tianyun Shi, Zhen Ma, Jing Wang 《Railway Sciences》 2024, No. 1, pp. 97-110 (14 pages)
Purpose: The paper aims to solve the problem of personnel intrusion identification within the limits of high-speed railways. It adopts the fusion method of millimeter wave radar and camera to improve the accuracy of object recognition in dark and harsh weather conditions. Design/methodology/approach: This paper adopts the fusion strategy of radar and camera linkage to achieve focus amplification of long-distance targets and solves the problem of low illumination by laser light filling of the focus point. In order to improve the recognition effect, this paper adopts the YOLOv8 algorithm for multi-scale target recognition. In addition, for the image distortion caused by bad weather, this paper proposes a linkage and tracking fusion strategy to output the correct alarm results. Findings: Simulated intrusion tests show that the proposed method can effectively detect human intrusion within 0–200 m during the day and night in sunny weather and can achieve more than 80% recognition accuracy for extreme severe weather conditions. Originality/value: (1) The authors propose a personnel intrusion monitoring scheme based on the fusion of millimeter wave radar and camera, achieving all-weather intrusion monitoring; (2) The authors propose a new multi-level fusion algorithm based on linkage and tracking to achieve intrusion target monitoring under adverse weather conditions; (3) The authors have conducted a large number of innovative simulation experiments to verify the effectiveness of the method proposed in this article.
Keywords: High-speed rail perimeter; Personnel invasion; Object detection; All-weather; Radar-camera fusion
EfficientShip: A Hybrid Deep Learning Framework for Ship Detection in the River
7
Authors: Huafeng Chen, Junxing Xue, Hanyun Wen, Yurong Hu, Yudong Zhang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 1, pp. 301-320 (20 pages)
Optical image-based ship detection can ensure the safety of ships and promote the orderly management of ships in offshore waters. Current deep learning research on optical image-based ship detection mainly focuses on improving one-stage detectors for real-time ship detection but sacrifices detection accuracy. To solve this problem, we present a hybrid ship detection framework named EfficientShip in this paper. The core parts of EfficientShip are DLA-backboned object location (DBOL) and CascadeRCNN-guided object classification (CROC). The DBOL is responsible for finding potential ship objects, and the CROC is used to categorize the potential ship objects. We also design a pixel-spatial-level data augmentation (PSDA) to reduce the risk of detection model overfitting. We compare the proposed EfficientShip with state-of-the-art (SOTA) literature on a ship detection dataset called Seaships. Experiments show our ship detection framework achieves a result of 99.63% (mAP) at 45 fps, which is much better than 8 SOTA approaches on detection accuracy and can also meet the requirements of real-time application scenarios.
Keywords: Ship detection; deep learning; data augmentation; object location; object classification
Enhanced Object Detection and Classification via Multi-Method Fusion
8
Authors: Muhammad Waqas Ahmed, Nouf Abdullah Almujally, Abdulwahab Alazeb, Asaad Algarni, Jeongmin Park 《Computers, Materials & Continua》 SCIE EI 2024, No. 5, pp. 3315-3331 (17 pages)
Advances in machine vision systems have revolutionized applications such as autonomous driving, robotic navigation, and augmented reality. Despite substantial progress, challenges persist, including dynamic backgrounds, occlusion, and limited labeled data. To address these challenges, we introduce a comprehensive methodology to enhance image classification and object detection accuracy. The proposed approach involves the integration of multiple methods in a complementary way. The process commences with the application of Gaussian filters to mitigate the impact of noise interference. These images are then processed for segmentation using Fuzzy C-Means segmentation in parallel with saliency mapping techniques to find the most prominent regions. The Binary Robust Independent Elementary Features (BRIEF) characteristics are then extracted from data derived from saliency maps and segmented images. For precise object separation, Oriented FAST and Rotated BRIEF (ORB) algorithms are employed. Genetic Algorithms (GAs) are used to optimize Random Forest classifier parameters, which leads to improved performance. Our method stands out due to its comprehensive approach, adeptly addressing challenges such as changing backdrops, occlusion, and limited labeled data concurrently. A significant enhancement has been achieved by integrating Genetic Algorithms (GAs) to precisely optimize parameters. This minor adjustment not only boosts the uniqueness of our system but also amplifies its overall efficacy. The proposed methodology has demonstrated notable classification accuracies of 90.9% and 89.0% on the challenging Corel-1k and MSRC datasets, respectively. Furthermore, detection accuracies of 87.2% and 86.6% have been attained. Although our method performed well on both datasets, it may face difficulties with real-world data, especially where datasets have highly complex backgrounds. Despite these limitations, GA integration for parameter optimization shows a notable strength in enhancing the overall adaptability and performance of our system.
Keywords: BRIEF features; saliency map; fuzzy c-means; object detection; object recognition
Confusing Object Detection: A Survey
9
Authors: Kunkun Tong, Guchu Zou, Xin Tan, Jingyu Gong, Zhenyi Qi, Zhizhong Zhang, Yuan Xie, Lizhuang Ma 《Computers, Materials & Continua》 SCIE EI 2024, No. 9, pp. 3421-3461 (41 pages)
Confusing object detection (COD), such as glass, mirrors, and camouflaged objects, represents a burgeoning visual detection task centered on pinpointing and distinguishing concealed targets within intricate backgrounds, leveraging deep learning methodologies. Despite garnering increasing attention in computer vision, the focus of most existing works leans toward formulating task-specific solutions rather than delving into in-depth analyses of methodological structures. As of now, there is a notable absence of a comprehensive systematic review that focuses on recently proposed deep learning-based models for these specific tasks. To fill this gap, our study presents a pioneering review that covers both the models and the publicly available benchmark datasets, while also identifying potential directions for future research in this field. The current dataset primarily focuses on single confusing object detection at the image level, with some studies extending to video-level data. We conduct an in-depth analysis of deep learning architectures, revealing that the current state-of-the-art (SOTA) COD methods demonstrate promising performance in single object detection. We also compile and provide detailed descriptions of widely used datasets relevant to these detection tasks. Our endeavor extends to discussing the limitations observed in current methodologies, alongside proposed solutions aimed at enhancing detection accuracy. Additionally, we deliberate on relevant applications and outline future research trajectories, aiming to catalyze advancements in the field of glass, mirror, and camouflaged object detection.
Keywords: Confusing object detection; mirror detection; glass detection; camouflaged object detection; deep learning
Multi-granularity feature enhancement network for maritime ship detection
10
Authors: Li Ying, Duoqian Miao, Zhifei Zhang, Hongyun Zhang, Witold Pedrycz 《CAAI Transactions on Intelligence Technology》 SCIE EI 2024, No. 3, pp. 649-664 (16 pages)
Due to the characteristics of high resolution and rich texture information, visible light images are widely used for maritime ship detection. However, these images are susceptible to sea fog and ships of different sizes, which can result in missed detections and false alarms, ultimately resulting in lower detection accuracy. To address these issues, a novel multi-granularity feature enhancement network, MFENet, which includes a three-way dehazing module (3WDM) and a multi-granularity feature enhancement module (MFEM), is proposed. The 3WDM eliminates sea fog interference by using an image clarity automatic classification algorithm based on three-way decisions and FFA-Net to obtain clear image samples. Additionally, the MFEM improves the accuracy of detecting ships of different sizes by utilising an improved super-resolution reconstruction convolutional neural network to enhance the resolution and semantic representation capability of the feature maps from YOLOv7. Experimental results demonstrate that MFENet surpasses the other 15 competing models in terms of the mean Average Precision metric on two benchmark datasets, achieving 96.28% on the McShips dataset and 97.71% on the SeaShips dataset.
Keywords: object classification; object recognition; rough sets; rough set theory
Floating Waste Discovery by Request via Object-Centric Learning
11
Authors: Bingfei Fu 《Computers, Materials & Continua》 SCIE EI 2024, No. 7, pp. 1407-1424 (18 pages)
Discovering floating wastes, especially bottles on water, is a crucial research problem in environmental hygiene. Nevertheless, real-world applications often face challenges such as interference from irrelevant objects and the high cost associated with data collection. Consequently, devising algorithms capable of accurately localizing specific objects within a scene in scenarios where annotated data is limited remains a formidable challenge. To solve this problem, this paper proposes an object discovery by request problem setting and a corresponding algorithmic framework. The proposed problem setting aims to identify specified objects in scenes, and the associated algorithmic framework comprises pseudo data generation and an object discovery by request network. Pseudo-data generation generates images resembling natural scenes through various data augmentation rules, using a small number of object samples and scene images. The network structure of object discovery by request utilizes the pre-trained Vision Transformer (ViT) model as the backbone, employs object-centric methods to learn the latent representations of foreground objects, and applies patch-level reconstruction constraints to the model. During the validation phase, we use the generated pseudo datasets as training sets and evaluate the performance of our model on the original test sets. Experiments show that our method achieves state-of-the-art performance on the Unmanned Aerial Vehicles-Bottle Detection (UAV-BD) dataset and the self-constructed dataset Bottle, especially in multi-object scenarios.
Keywords: Unsupervised object discovery; object-centric learning; pseudo data generation; real-world object discovery by request
Rail-PillarNet: A 3D Detection Network for Railway Foreign Object Based on LiDAR
12
Authors: Fan Li, Shuyao Zhang, Jie Yang, Zhicheng Feng, Zhichao Chen 《Computers, Materials & Continua》 SCIE EI 2024, No. 9, pp. 3819-3833 (15 pages)
Aiming at the limitations of the existing railway foreign object detection methods based on two-dimensional (2D) images, such as short detection distance, strong influence of environment, and lack of distance information, we propose Rail-PillarNet, a three-dimensional (3D) LiDAR (Light Detection and Ranging) railway foreign object detection method based on the improvement of PointPillars. Firstly, the parallel attention pillar encoder (PAPE) is designed to fully extract the features of the pillars and alleviate the problem of local fine-grained information loss in the PointPillars pillar encoder. Secondly, a fine backbone network is designed to improve the feature extraction capability of the network by combining the coding characteristics of LiDAR point cloud features and the residual structure. Finally, the initial weight parameters of the model were optimised by the transfer learning training method to further improve accuracy. The experimental results on the OSDaR23 dataset show that the average accuracy of Rail-PillarNet reaches 58.51%, which is higher than most mainstream models, and the number of parameters is 5.49 M. Compared with PointPillars, the accuracy of each target is improved by 10.94%, 3.53%, 16.96%, and 19.90%, respectively, and the number of parameters only increases by 0.64 M, which achieves a balance between the number of parameters and accuracy.
Keywords: Railway foreign object; light detection and ranging (LiDAR); 3D object detection; PointPillars; parallel attention mechanism; transfer learning
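The twelfth entry builds on PointPillars, which begins by grouping LiDAR points into vertical columns ("pillars") on an x-y grid before any features are learned. The sketch below illustrates only that grouping step under simple assumptions. `pillarize` and `pillar_size` are hypothetical names, and the paper's encoder (PAPE) is not reproduced here.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres
PillarKey = Tuple[int, int]         # integer grid cell on the x-y plane


def pillarize(points: List[Point], pillar_size: float = 0.5) -> Dict[PillarKey, List[Point]]:
    """Group points into pillars: one bucket per x-y grid cell, any height."""
    if pillar_size <= 0:
        raise ValueError("pillar_size must be positive")
    pillars: Dict[PillarKey, List[Point]] = defaultdict(list)
    for x, y, z in points:
        # floor (not int) so that negative coordinates land in the right cell.
        key = (math.floor(x / pillar_size), math.floor(y / pillar_size))
        pillars[key].append((x, y, z))
    return dict(pillars)
```

Each non-empty pillar is then encoded into a feature vector, so the point cloud becomes a sparse 2D pseudo-image that ordinary convolutional backbones can process.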
Two-Layer Attention Feature Pyramid Network for Small Object Detection
13
Authors: Sheng Xiang, Junhao Ma, Qunli Shang, Xianbao Wang, Defu Chen 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 10, pp. 713-731 (19 pages)
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses the attention module to make the network more focused on the semantic information of the object and fuse it to the lower layer, so that each layer contains similar semantic information, to alleviate the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the information details of the small object, and fuse the enhanced features to other feature layers to ensure that each layer is rich in small object information, to improve small object detection accuracy. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MS COCO) and Pattern Analysis Statistical Modelling and Computational Learning, Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Keywords: Small object detection; two-layer attention module; small object detail enhancement module; feature pyramid network
Construction Activity Analysis of Workers Based on Human Posture Estimation Information
14
Authors: Xuhong Zhou, Shuai Li, Jiepeng Liu, Zhou Wu, Yohchia Frank Chen 《Engineering》 SCIE EI CAS CSCD 2024, No. 2, pp. 225-236 (12 pages)
Identifying workers’ construction activities or behaviors can enable managers to better monitor labor efficiency and construction progress. However, current activity analysis methods for construction workers rely solely on manual observations and recordings, which consumes considerable time and has high labor costs. Researchers have focused on monitoring on-site construction activities of workers. However, when multiple workers are working together, current research cannot accurately and automatically identify the construction activity. This research proposes a deep learning framework for the automated analysis of the construction activities of multiple workers. In this framework, multiple deep neural network models are designed and used to complete worker key point extraction, worker tracking, and worker construction activity analysis. The designed framework was tested at an actual construction site, and activity recognition for multiple workers was performed, indicating the feasibility of the framework for the automated monitoring of work efficiency.
Keywords: Pose estimation; Activity analysis; Object tracking; Construction workers; Automatic systems
Exploring Deep Learning Methods for Computer Vision Applications across Multiple Sectors: Challenges and Future Trends
15
Authors: Narayanan Ganesh, Rajendran Shankar, Miroslav Mahdal, Janakiraman SenthilMurugan, Jasgurpreet Singh Chohan, Kanak Kalita 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 4, pp. 103-141 (39 pages)
Computer vision (CV) was developed for computers and other systems to act or make recommendations based on visual inputs, such as digital photos, movies, and other media. Deep learning (DL) methods are more successful than other traditional machine learning (ML) methods in CV. DL techniques can produce state-of-the-art results for difficult CV problems like picture categorization, object detection, and face recognition. In this review, a structured discussion on the history, methods, and applications of DL methods to CV problems is presented. The sector-wise presentation of applications in this paper may be particularly useful for researchers in niche fields who have limited or introductory knowledge of DL methods and CV. This review will provide readers with context and examples of how these techniques can be applied to specific areas. A curated list of popular datasets and a brief description of them are also included for the benefit of readers.
Keywords: Neural network; machine vision; classification; object detection; deep learning
Multimodal fusion recognition for digital twin
16
Authors: Tianzhe Zhou, Xuguang Zhang, Bing Kang, Mingkai Chen 《Digital Communications and Networks》 SCIE CSCD 2024, No. 2, pp. 337-346 (10 pages)
The digital twin is the concept of transcending reality, which is the reverse feedback from the real physical space to the virtual digital space. People hold great prospects for this emerging technology. In order to realize the upgrading of the digital twin industrial chain, it is urgent to introduce more modalities, such as vision, haptics, hearing, and smell, into the virtual digital space, which assists physical entities and virtual objects in creating a closer connection. Therefore, perceptual understanding and object recognition have become an urgent hot topic in the digital twin. Existing surface material classification schemes often achieve recognition through machine learning or deep learning in a single modality, ignoring the complementarity between multiple modalities. In order to overcome this dilemma, we propose a multimodal fusion network in our article that combines two modalities, visual and haptic, for surface material recognition. On the one hand, the network makes full use of the potential correlations between multiple modalities to deeply mine the modal semantics and complete the data mapping. On the other hand, the network is extensible and can be used as a universal architecture to include more modalities. Experiments show that the constructed multimodal fusion network can achieve 99.42% classification accuracy while reducing complexity.
Keywords: Digital twin; Multimodal fusion; Object recognition; Deep learning; Transfer learning
YOLO-MFD:Remote Sensing Image Object Detection with Multi-Scale Fusion Dynamic Head
17
作者 Zhongyuan Zhang Wenqiu Zhu 《Computers, Materials & Continua》 SCIE EI 2024年第5期2547-2563,共17页
Remote sensing imagery, due to its high altitude, presents inherent challenges characterized by multiple scales, limited target areas, and intricate backgrounds. These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery. Additionally, these complexities contribute to inaccuracies in target localization and hinder precise target categorization. This paper addresses these challenges by proposing a solution: the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting our method, we delve into the prevalent issues faced in remote sensing imagery analysis. Specifically, we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose a lightweight multi-scale module called CEF. This module significantly improves the model's ability to capture important image features by merging multi-scale feature information, and it addresses the missed detections and false alarms that are common in remote sensing imagery. Second, an additional small-target detection head is added, and a residual link is established with the higher-level feature extraction module in the backbone. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remotely sensed images. Finally, a dynamic head attention mechanism is introduced, which gives the model greater flexibility and accuracy in recognizing shapes and targets of different sizes. Consequently, the precision of object detection is significantly improved. Experimental results show that the YOLO-MFD model improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Keywords: Object detection; YOLOv8; multi-scale; attention mechanism; dynamic detection head
SMSTracker:A Self-Calibration Multi-Head Self-Attention Transformer for Visual Object Tracking
18
Authors: Zhongyang Wang, Hu Zhu, Feng Liu 《Computers, Materials & Continua》 SCIE EI 2024, Issue 7, pp. 605-623 (19 pages)
Visual object tracking plays a crucial role in computer vision. In recent years, researchers have proposed various methods to achieve high-performance object tracking. Among these, methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information. However, current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information. In this paper, we introduce the self-calibration multi-head self-attention Transformer (SMSTracker) as a solution to these challenges. It employs a hybrid tensor decomposition self-organizing multi-head self-attention transformer mechanism, which not only compresses and accelerates Transformer operations but also significantly reduces redundant data, thereby enhancing the accuracy and efficiency of tracking. Additionally, we introduce a self-calibration attention fusion block to resolve the attention ambiguities and inconsistencies commonly found in traditional tracking methods, ensuring the stability and reliability of tracking performance across various scenarios. By integrating a hybrid tensor decomposition approach with a self-organizing multi-head self-attentive transformer mechanism, SMSTracker enhances the efficiency and accuracy of the tracking process. Experimental results show that SMSTracker achieves competitive performance in visual object tracking, demonstrating its potential to provide more robust and efficient tracking solutions in real-world applications.
Keywords: Visual object tracking; tensor decomposition; Transformer; self-attention
Learning Discriminatory Information for Object Detection on Urine Sediment Image
19
Authors: Sixian Chan, Binghui Wu, Guodao Zhang, Yuan Yao, Hongqiang Wang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, Issue 1, pp. 411-428 (18 pages)
In clinical practice, the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications. Measuring the amount of each type of urine sediment allows for screening, diagnosis, and evaluation of kidney and urinary tract disease, providing insight into the specific type and severity. However, manual urine sediment examination is labor-intensive, time-consuming, and subjective. Traditional machine learning based object detection methods require hand-crafted features for localization and classification; they generalize poorly and struggle to count urine sediments quickly and accurately. Deep learning based object detection methods have the potential to address these challenges, but they require access to large urine sediment image datasets. Unfortunately, only a limited number of urine sediment datasets are currently publicly available. To alleviate the lack of urine sediment datasets in medical image analysis, we propose a new dataset named UriSed2K, which contains 2465 high-quality images annotated with expert guidance. Two main challenges are associated with our dataset: a large number of small objects and the occlusion between these small objects. Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset. Specifically, our goal is to improve the accuracy and efficiency of the detection algorithm and, in doing so, provide medical professionals with an automatic detector that saves time and effort. We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO. The proposed algorithm comprises a local context attention module and a global background suppression module, which aid the detector in distinguishing urine sediment features in the image. The local context attention module captures context information beyond the object region, while the global background suppression module emphasizes objects in uninformative backgrounds. We comprehensively evaluate our method on the UriSed2K dataset, which includes seven categories of urine sediments: erythrocytes (red blood cells), leukocytes (white blood cells), epithelial cells, crystals, mycetes, broken erythrocytes, and broken leukocytes. The method achieves the best average precision (AP) of 95.3% while taking only 10 ms per image. The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.
Keywords: Object detection; attention mechanism; medical image; urine sediment
Intelligent Recognition Using Ultralight Multifunctional Nano-Layered Carbon Aerogel Sensors with Human-Like Tactile Perception
20
Authors: Huiqi Zhao, Yizheng Zhang, Lei Han, Weiqi Qian, Jiabin Wang, Heting Wu, Jingchen Li, Yuan Dai, Zhengyou Zhang, Chris R. Bowen, Ya Yang 《Nano-Micro Letters》 SCIE EI CAS CSCD 2024, Issue 1, pp. 172-186 (15 pages)
Humans can perceive our complex world through multi-sensory fusion. Under limited visual conditions, people can sense a variety of tactile signals to identify objects accurately and rapidly. However, replicating this unique capability in robots remains a significant challenge. Here, we present a new form of ultralight multifunctional tactile nano-layered carbon aerogel sensor that provides pressure, temperature, material recognition, and 3D location capabilities, which is combined with multimodal supervised learning algorithms for object recognition. The sensor exhibits human-like pressure (0.04–100 kPa) and temperature (21.5–66.2 °C) detection, millisecond response times (11 ms), a pressure sensitivity of 92.22 kPa⁻¹, and triboelectric durability of over 6000 cycles. The devised algorithm has universality and can accommodate a range of application scenarios. The tactile system can identify common foods in a kitchen scene with 94.63% accuracy and explore the topographic and geomorphic features of a Mars scene with 100% accuracy. This sensing approach empowers robots with versatile tactile perception to advance future society toward heightened sensing, recognition, and intelligence.
Keywords: Multifunctional sensor; Tactile perception; Multimodal machine learning algorithms; Universal tactile system; Intelligent object recognition