3,608 articles found
ISHD: Intelligent Standing Human Detection of Video Surveillance for the Smart Examination Environment (cited by: 1)
1
Authors: Wu Song, Yayuan Tang, Wenxue Tan, Sheng Ren. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 10, pp. 509-526 (18 pages)
In the environment of smart examination rooms, it is important to quickly and accurately detect abnormal behavior (human standing) for the construction of a smart campus. Based on deep learning, we propose an intelligent standing human detection (ISHD) method based on an improved single shot multibox detector to detect the target of standing human posture in the scene frame of exam room video surveillance at a specific examination stage. ISHD combines the MobileNet network in a single shot multibox detector network, improves the posture feature extractor of a standing person, merges prior knowledge, and introduces transfer learning in the training strategy, which greatly reduces the computation amount, improves the detection accuracy, and reduces the training difficulty. The experiment shows that the proposed model has a better detection ability for small and medium-sized standing human body postures in video test scenes on the EMV-2 dataset.
Keywords: deep learning; object detection; video surveillance of exam room; smart examination environment
Guest Editorial: Intelligent Video Surveillance and Related Technologies
2
Authors: Chung-Lin Huang, Cheng-Chang Lien, I-Cheng Chang, Chih-Yang Lin. Journal of Electronic Science and Technology (CAS, CSCD), 2017, No. 2, pp. 113-114 (2 pages)
Due to the increasing demand for developing a secure and smart living environment, intelligent video surveillance technology has attracted considerable attention. Building an automatic, reliable, secure, and intelligent video surveillance system has spawned large research projects and triggered many popular research topics in several international conferences and workshops recently. This special issue of Journal of Electronic Science and Technology (JEST) aims to present recent advances in video surveillance systems which address the observation of people in an environment, leading to a real-time description of their actions and interactions.
Keywords: guest editorial; intelligent video surveillance; related technologies
Weapons Detection for Security and Video Surveillance Using CNN and YOLO-V5s (cited by: 1)
3
Authors: Abdul Hanan Ashraf, Muhammad Imran, Abdulrahman M.Qahtani, Abdulmajeed Alsufyani, Omar Almutiry, Awais Mahmood, Muhammad Attique, Mohamed Habib. Computers, Materials & Continua (SCIE, EI), 2022, No. 2, pp. 2761-2775 (15 pages)
In recent years, the number of gun-related incidents has exceeded 250,000 per year, and over 85% of the existing 1 billion firearms are in civilian hands. Manual monitoring has not proven effective in detecting firearms, which is why an automated weapon detection system is needed. Various automated convolutional neural network (CNN) weapon detection systems have been proposed in the past and have produced good results. However, these techniques have high computation overhead and are slow to provide the real-time detection that is essential for a weapon detection system. These models also have a high rate of false negatives because they often fail to detect guns due to the low quality and visibility issues of surveillance videos. This research work aims to minimize the rate of false negatives and false positives in weapon detection while keeping the speed of detection as a key parameter. The proposed framework is based on You Only Look Once (YOLO) and Area of Interest (AOI). Initially, the models take pre-processed frames in which the background is removed using the Gaussian blur algorithm. The proposed architecture is assessed through various performance parameters such as false negatives, false positives, precision, recall rate, and F1 score. The results show that YOLO-v5s achieves a high recall rate and a high detection speed: it reached 0.010 s per frame, compared to 0.17 s for Faster R-CNN. It is promising for use in the field of security and weapon detection.
Keywords: video surveillance; weapon detection; you only look once; convolutional neural networks
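The abstract above evaluates detection through false negatives, false positives, precision, recall, and F1 score. As a minimal sketch of how these metrics relate to one another, they can be computed from raw detection counts. The function name and the counts below are hypothetical and chosen for illustration only; they are not results from the paper.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall and F1 from true-positive, false-positive
    and false-negative detection counts."""
    # Guard against empty denominators (e.g. no detections at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 misses.
metrics = detection_metrics(tp=90, fp=10, fn=30)
```

Reducing false negatives raises recall and reducing false positives raises precision; F1 is the harmonic mean of the two.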
UTILITY OPTIMIZATION SCHEDULING FOR MULTI-POINT VIDEO SURVEILLANCE IN UBIQUITOUS NETWORK (cited by: 1)
4
Authors: Zhang Chen, Huang Liusheng, Xu Hongli. Journal of Electronics (China), 2013, No. 1, pp. 1-8 (8 pages)
Resource allocation is an important problem in ubiquitous networks. Most existing resource allocation methods consider only wireless networks, are not suitable for the ubiquitous network environment, and harm the interests of individual users with unstable resource requirements. This paper considers multi-point video surveillance scenarios in a complex network environment with both wired and wireless networks. We introduce a utility estimated by the total costs of an individual network user. The problem is studied through mathematical modeling, and we propose an improved problem-specific branch-and-cut algorithm to solve it. The algorithm follows the divide-and-conquer principle and fully considers the duality feature of network selection. The experiment is conducted by simulation in C and Lingo. It shows that, compared with a centralized random allocation scheme and a cost-greedy allocation scheme, the proposed scheme performs better, reducing the user's total costs by 13.0% and 30.6%, respectively.
Keywords: video surveillance system; network environment; optimized scheduling; resource allocation; wireless network; individual user; cost estimation; allocation scheme
Intelligent Mobile Video Surveillance System with Multilevel Distillation
5
Authors: Yuan-Kai Wang, Hung-Yu Chen. Journal of Electronic Science and Technology (CAS, CSCD), 2017, No. 2, pp. 133-140 (8 pages)
This paper proposes a mobile video surveillance system consisting of intelligent video analysis and mobile communication networking. This multilevel distillation approach helps mobile users monitor tremendous surveillance videos on demand through video streaming over mobile communication networks. The intelligent video analysis includes moving object detection/tracking and key frame selection, which make it possible to browse useful video clips. The communication networking services, comprising video transcoding, multimedia messaging, and mobile video streaming, transmit surveillance information to mobile appliances. Moving object detection is achieved by background subtraction and particle filter tracking. Key frame selection, which aims to deliver an alarm to a mobile client using a multimedia messaging service accompanied by an extracted clear frame, is achieved by devising a weighted importance criterion considering object clarity and face appearance. In addition, a spatial-domain cascaded transcoder is developed to convert the filtered image sequence of detected objects into the mobile video streaming format. Experimental results show that the system can successfully detect all events of moving objects for a complex surveillance scene, choose very appropriate key frames for users, and transcode the images with a high peak signal-to-noise ratio (PSNR).
Keywords: mobile video streaming; moving object detection; key frame extraction; video surveillance; video transcoding
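The abstract above names background subtraction as the basis of its moving object detection but does not say which background model is used. As a rough sketch only, assuming a simple running-average background model (a swapped-in technique, not the authors' method), and with hypothetical function names, thresholds, and pixel values, the idea looks like this on small grayscale frames stored as nested lists:

```python
def foreground_mask(background, frame, threshold=30):
    """Flag pixels whose absolute difference from the background model
    exceeds the threshold (True = candidate moving object)."""
    return [[abs(f - b) > threshold for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def update_background(background, frame, alpha=0.05):
    """Running average: blend a fraction alpha of the new frame
    into the background model."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


# Hypothetical 2x3 grayscale frame with one bright "moving" pixel.
background = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
frame = [[10, 200, 10], [10, 10, 12]]
mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

A small `alpha` makes the model adapt slowly, so short-lived objects remain visible as foreground for many frames.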
Human Detection for Video Surveillance in Hospital
6
Authors: Cheng-Hung Chuang, Zhen-You Lian, Po-Ren Teng, Miao-Jen Lin. Journal of Electronic Science and Technology (CAS, CSCD), 2017, No. 2, pp. 147-152 (6 pages)
This paper presents a human detection system in a vision-based hospital surveillance environment. The system is composed of three subsystems: a background segmentation subsystem (BSS), a human feature extraction subsystem (HFES), and a human recognition subsystem (HRS). The codebook background model is applied in the BSS, histogram of oriented gradients (HOG) features are used in the HFES, and support vector machine (SVM) classification is employed in the HRS. By integrating these subsystems, human detection in a vision-based hospital surveillance environment is performed. Experimental results show that the proposed system can effectively detect most of the people in hospital surveillance video sequences.
Keywords: background segmentation; codebook; histogram of oriented gradients (HOG); human classification; support vector machine (SVM); video surveillance
Visibility Enhancement of Scene Images Degraded by Foggy Weather Condition: An Application to Video Surveillance
7
Authors: Ghulfam Zahra, Muhammad Imran, Abdulrahman M.Qahtani, Abdulmajeed Alsufyani, Omar Almutiry, Awais Mahmood, Fayez Eid Alazemi. Computers, Materials & Continua (SCIE, EI), 2021, No. 9, pp. 3465-3481 (17 pages)
In recent years, video surveillance applications have played a significant role in our daily lives. Images taken during foggy and hazy weather conditions for video surveillance applications lose their authenticity and hence have reduced visibility. The reason behind visibility enhancement of foggy and hazy images is to help numerous computer and machine vision applications such as satellite imagery, object detection, target killing, and surveillance. To remove fog and enhance visibility, a number of visibility enhancement algorithms and methods have been proposed in the past. However, these techniques suffer from several limitations that place strong obstacles to real-world outdoor computer vision applications. The existing techniques do not perform well when images contain heavy fog, large white regions, and strong atmospheric light. This research work proposes a new framework to defog and dehaze images in order to enhance the visibility of foggy and hazy images. The proposed framework is based on a conditional generative adversarial network (CGAN) with two networks, a generator and a discriminator, each having distinct properties. The generator network generates fog-free images from foggy images, and the discriminator network distinguishes between the restored image and the original fog-free image. Experiments are conducted on the FRIDA dataset and on haze images. To assess the performance of the proposed method on the fog dataset, we use PSNR and SSIM, and for the haze dataset we use e, r̄, and σ as performance metrics. Experimental results show that the proposed method achieved higher PSNR and SSIM values (18.23 and 0.823) than the compared method (13.94 and 0.791). Experimental results demonstrate that the proposed framework has removed fog and enhanced the visibility of foggy and hazy images.
Keywords: video surveillance; degraded images; image restoration; transmission map; visibility enhancement
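The abstract above reports image quality as PSNR. As a minimal sketch of the standard definition, PSNR = 10 log10(MAX² / MSE), assuming 8-bit grayscale images stored as nested lists, the metric can be computed as follows. The function name and the sample pixel values are hypothetical; this is not the paper's evaluation code.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized
    grayscale images given as nested lists of pixel values."""
    squared_errors = [(r - s) ** 2
                      for ref_row, res_row in zip(reference, restored)
                      for r, s in zip(ref_row, res_row)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images: every restored pixel is off by 10 gray levels.
reference = [[100, 100], [100, 100]]
restored = [[110, 110], [110, 110]]
score = psnr(reference, restored)
```

Higher PSNR means the restored image is closer to the reference, which is why the abstract reads 18.23 as better than 13.94.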
Smart Deep Learning Based Human Behaviour Classification for Video Surveillance
8
Authors: Esam A.Al.Qaralleh, Fahad Aldhaban, Halah Nasseif, Malek Z.Alksasbeh, Bassam A.Y.Alqaralleh. Computers, Materials & Continua (SCIE, EI), 2022, No. 9, pp. 5593-5605 (13 pages)
Real-time video surveillance systems are commonly employed to aid security professionals in preventing crimes. The use of deep learning (DL) technologies has transformed real-time video surveillance into smart video surveillance systems that automate human behavior classification. The recognition of events in surveillance videos is considered a hot research topic in the field of computer science and is gaining significant attention. Human action recognition (HAR) is treated as a crucial issue in several application areas and in smart video surveillance to improve the security level. Advances in DL models help to achieve improved recognition performance. In this view, this paper presents a smart deep learning-based human behavior classification (SDL-HBC) model for real-time video surveillance. The proposed SDL-HBC model employs adaptive median filtering (AMF)-based pre-processing to reduce the noise content. The capsule network (CapsNet) model is used to extract feature vectors, and the hyperparameters of the CapsNet model are tuned using the Adam optimizer. Finally, the differential evolution (DE) with stacked autoencoder (SAE) model is applied to classify human activities in the intelligent video surveillance system. The SDL-HBC technique is validated using two benchmark datasets, including the KTH dataset. The experimental outcomes show enhanced recognition performance of the SDL-HBC technique over recent state-of-the-art approaches, with a maximum accuracy of 0.9922.
Keywords: human action recognition; video surveillance; intelligent systems; deep learning; security; image classification
Detection of Objects in Motion—A Survey of Video Surveillance
9
Authors: Jamal Raiyn. Advances in Internet of Things, 2013, No. 4, pp. 73-78 (6 pages)
Video surveillance is one of the most important issues in the homeland security field. It is used as a security system because of its ability to track and detect a particular person. To overcome the shortcomings of conventional video surveillance systems, which are based on human perception, we introduce a novel cognitive video surveillance system (CVS) that is based on mobile agents. CVS offers important attributes such as suspect object detection and smart camera cooperation for people tracking. According to many studies, an agent-based approach is appropriate for distributed systems, since mobile agents can transfer copies of themselves to other servers in the system.
Keywords: video surveillance; object detection; image analysis
Framework for distributed video surveillance in heterogeneous environment
10
Authors: FU Xiang, ZENG Jie-xian, NIE Yun-feng. 《通讯和计算机(中英文版)》, 2009, No. 2, pp. 25-28 (4 pages)
Keywords: wired network; video surveillance; mobile phone; signal; communication technology
Semi-automatic Video Annotation Tool to Generate Ground Truth for Intelligent Video Surveillance Systems
11
Authors: Ryu-Hyeok Gwon, Jin-Tak Park, Hakil Kim, Yoo-Sung Kim. Journal of Electrical Engineering, 2014, No. 4, pp. 160-168 (9 pages)
Keywords: electrical control; control theory; electrical measurement; lumped parameters
Review for Anomaly Detection in Video Surveillance System Based on Deep Learning
12
Authors: Yuchang Si. IJLAI Transactions on Science and Engineering, 2024, No. 1, pp. 63-72 (10 pages)
In this paper, abnormal target detection and location in video surveillance systems are studied. In recent years, with the rapid development of network information technology, video surveillance technology has been widely used, and manual anomaly detection methods cannot keep up with the growth of video surveillance data. 3D technology, face recognition technology, and related advances have also promoted the development of computer vision, providing effective support for the rapid analysis of large amounts of video data. At present, abnormal target detection methods in video surveillance systems mainly fall into two categories. The first extracts two-dimensional features from video surveillance data and represents video targets according to the extracted features; the information expressed mainly includes temporal and spatial information. The second directly learns 3D spatio-temporal features for modules with motion information to detect the location of the abnormal target. Finally, the paper summarizes the full text and looks ahead to the future development of video anomaly detection from three aspects: datasets, methods, and evaluation metrics.
Keywords: anomaly detection; video surveillance; deep learning
An Efficient Attention-Based Strategy for Anomaly Detection in Surveillance Video
13
Authors: Sareer Ul Amin, Yongjun Kim, Irfan Sami, Sangoh Park, Sanghyun Seo. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 9, pp. 3939-3958 (20 pages)
In the present technological world, surveillance cameras generate an immense amount of video data from various sources, making its scrutiny tough for computer vision specialists. It is difficult to search for anomalous events manually in these massive video records since they happen infrequently and with a low probability in real-world monitoring systems. Therefore, intelligent surveillance is a requirement of the modern day, as it enables the automatic identification of normal and aberrant behavior using artificial intelligence and computer vision technologies. In this article, we introduce an efficient attention-based deep-learning approach for anomaly detection in surveillance video (ADSV). At the input of the ADSV, a shot boundary detection technique is used to segment prominent frames. Next, the Lightweight Convolution Neural Network (LWCNN) model receives the segmented frames to extract spatial and temporal information from the intermediate layer. Following that, spatial and temporal features are learned using Long Short-Term Memory (LSTM) cells and an Attention Network from a series of frames for each anomalous activity in a sample. To detect motion and action, the LWCNN receives chronologically sorted frames. Finally, the anomalous activity in the video is identified using the proposed trained ADSV model. Extensive experiments are conducted on complex and challenging benchmark datasets. In addition, the experimental results have been compared to state-of-the-art methodologies, and a significant improvement is attained, demonstrating the efficiency of our ADSV method.
Keywords: attention-based anomaly detection; video shots segmentation; video surveillance; computer vision; deep learning; smart surveillance system; violence detection; attention model
Vision-based fatigue crack detection using global motion compensation and video feature tracking
14
Authors: Rushil Mojidra, Jian Li, Ali Mohammadkhorasani, Fernando Moreu, Caroline Bennett, William Collins. Earthquake Engineering and Engineering Vibration (SCIE, EI, CSCD), 2023, No. 1, pp. 19-39 (21 pages)
Fatigue cracks that develop in civil infrastructure such as steel bridges due to repetitive loads pose a major threat to structural integrity. Despite being the most common practice for fatigue crack detection, human visual inspection is known to be labor intensive, time-consuming, and prone to error. In this study, a computer vision-based fatigue crack detection approach using a short video recorded under live loads by a moving consumer-grade camera is presented. The method detects fatigue cracks by tracking surface motion and identifies the differential motion pattern caused by opening and closing of the fatigue crack. However, the global motion introduced by a moving camera in the recorded video is typically far greater than the actual motion associated with fatigue crack opening/closing, leading to false detection results. To overcome this challenge, global motion compensation (GMC) techniques are introduced to compensate for camera-induced movement. In particular, hierarchical model-based motion estimation is adopted for 2D videos with simple geometry, and a new method is developed by extending the bundled camera paths approach for 3D videos with complex geometry. The proposed methodology is validated using two laboratory test setups for both in-plane and out-of-plane fatigue cracks. The results confirm the importance of motion compensation for both 2D and 3D videos and demonstrate the effectiveness of the proposed GMC methods as well as the subsequent crack detection algorithm.
Keywords: global motion compensation; fatigue crack detection; computer vision; parallax effect; distortion induced fatigue crack; video stabilization; camera motion; in-plane fatigue crack; out-of-plane fatigue crack analysis
A Personalized Video Synopsis Framework for Spherical Surveillance Video
15
Authors: S.Priyadharshini, Ansuman Mahapatra. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 6, pp. 2603-2616 (14 pages)
Video synopsis is an effective way to easily summarize long-recorded surveillance videos. The omnidirectional view allows the observer to select the desired field of view (FoV) from the different FoVs available for spherical surveillance video. By choosing to watch one portion, the observer misses out on the events occurring somewhere else in the spherical scene. This causes the observer to experience fear of missing out (FOMO). Hence, a novel personalized video synopsis approach for the generation of non-spherical videos has been introduced to address this issue. It also includes an action recognition module that makes it easy to display necessary actions by prioritizing them. This work minimizes and maximizes multiple goals such as loss of activity, collision, temporal consistency, length, show, and important action cost, respectively. The performance of the proposed framework is evaluated through extensive simulation and compared with state-of-the-art video synopsis optimization algorithms. Experimental results suggest that some constraints are better optimized by using the latest metaheuristic optimization algorithms to generate compact personalized synopsis videos from spherical surveillance videos.
Keywords: immersive video; non-spherical video synopsis; spherical video; panoramic surveillance video; 360° video
Quantum Computing Based Neural Networks for Anomaly Classification in Real-Time Surveillance Videos
16
Authors: MD.Yasar Arafath, A.Niranjil Kumar. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 8, pp. 2489-2508 (20 pages)
For intelligent surveillance videos, anomaly detection is extremely important. Deep learning algorithms have been popular for evaluating real-time surveillance recordings of events such as traffic accidents and criminal or unlawful incidents, as well as suicide attempts. Nevertheless, deep learning methods for classification, like convolutional neural networks, necessitate a lot of computing power. Quantum computing is a branch of technology that solves abnormal and complex problems using quantum mechanics. As a result, the focus of this research is on developing a hybrid quantum computing model based on deep learning. This research develops a Quantum Computing-based Convolutional Neural Network (QC-CNN) to extract features and classify anomalies from surveillance footage. A quantum-based circuit, such as the real amplitude circuit, is utilized to improve the performance of the model. To the best of the authors' knowledge, this is the first work to employ quantum deep learning techniques to classify anomalous events in video surveillance applications. Thirteen anomalies from the UCF-crime dataset are classified. Based on experimental results, the proposed model is capable of efficiently classifying data in terms of confusion matrix, Receiver Operating Characteristic (ROC), accuracy, Area Under Curve (AUC), precision, recall, and F1-score. The proposed QC-CNN attained a best accuracy of 95.65%, which is 5.37% higher than other existing models. To measure the efficiency of the proposed work, QC-CNN is also evaluated against classical and quantum models.
Keywords: deep learning; video surveillance; quantum computing; anomaly detection; convolutional neural network
Automatic Real-Time Detection of Infant Drowning Using YOLOv5 and Faster R-CNN Models Based on Video Surveillance
17
Authors: Qianen He, Zhiqiang Mei, Huisheng Zhang, Xiuying Xu. Journal of Social Computing (EI), 2023, No. 1, pp. 62-73 (12 pages)
Infant drowning has occurred frequently in swimming pools in recent years, which motivates research on automatic real-time detection of such accidents. Unlike youths or adults, swimming infants are small in size and motion range and are unable to send out distress signals in emergencies, which makes drowning harder to detect. To address this problem, a first step is taken towards detecting infant drowning automatically and efficiently based on video surveillance. Diverse live-scene videos of infant swimming and drowning are collected from a variety of natatoriums and labeled as datasets. A part of the datasets is downscaled or enlarged to enhance the generalization ability of the model. On this basis, the advantages of Faster R-CNN and a series of YOLOv5 models are explored to enable fast and accurate detection of infant drowning in the real world. Supervised learning experiments are carried out, and model test results show that the mean Average Precision (mAP) of either Faster R-CNN or YOLOv5s (from the YOLOv5 series) can exceed 89%; the former processes merely 6 video frames per second with a precision of only 62.04%, while the latter reaches an average speed of 75 frames/s with a precision of about 86.6%. YOLOv5s eventually stands out as the optimal model for detecting infant drowning in terms of overall performance, which is of great application value for reducing accidents in swimming pools.
Keywords: infant drowning detection; YOLOv5; Faster R-CNN; video surveillance; supervised learning
TEAM: Transformer Encoder Attention Module for Video Classification
18
Authors: Hae Sung Park, Yong Suk Choi. Computer Systems Science & Engineering, 2024, No. 2, pp. 451-477 (27 pages)
Much like humans focus solely on object movement to understand actions, directing a deep learning model's attention to the core contexts within videos is crucial for improving video comprehension. In a recent study, Video Masked Auto-Encoder (VideoMAE) employs a pre-training approach with a high ratio of tube masking and reconstruction, effectively mitigating spatial bias due to temporal redundancy in full video frames. This steers the model's focus toward detailed temporal contexts. However, as VideoMAE still relies on full video frames during the action recognition stage, it may exhibit a progressive shift in attention towards spatial contexts, deteriorating its ability to capture the main spatio-temporal contexts. To address this issue, we propose an attention-directing module named Transformer Encoder Attention Module (TEAM). This proposed module effectively directs the model's attention to the core characteristics within each video, inherently mitigating spatial bias. The TEAM first figures out the core features among the overall extracted features from each video. After that, it discerns the specific parts of the video where those features are located, encouraging the model to focus more on these informative parts. Consequently, during the action recognition stage, the proposed TEAM effectively shifts VideoMAE's attention from spatial contexts towards the core spatio-temporal contexts. This attention shift alleviates the spatial bias in the model and simultaneously enhances its ability to capture precise video contexts. We conduct extensive experiments to explore the optimal configuration that enables the TEAM to fulfill its intended design purpose and facilitates its seamless integration with the VideoMAE framework. The integrated model, i.e., VideoMAE+TEAM, outperforms the existing VideoMAE by a significant margin on Something-Something-V2 (71.3% vs. 70.3%). Moreover, the qualitative comparisons demonstrate that the TEAM encourages the model to disregard insignificant features and focus more on the essential video features, capturing more detailed spatio-temporal contexts within the video.
Keywords: video classification; action recognition; vision transformer; masked auto-encoder
SwinVid: Enhancing Video Object Detection Using Swin Transformer
19
Authors: Abdelrahman Maharek, Amr Abozeid, Rasha Orban, Kamal ElDahshan. Computer Systems Science & Engineering, 2024, No. 2, pp. 305-320 (16 pages)
What causes object detection in video to be less accurate than it is in still images? Some video frames are degraded in appearance by fast movement, out-of-focus camera shots, and changes in posture. These reasons have made video object detection (VID) a growing area of research in recent years. Video object detection can be used for various healthcare applications, such as detecting and tracking tumors in medical imaging, monitoring the movement of patients in hospitals and long-term care facilities, and analyzing videos of surgeries to improve technique and training. Additionally, it can be used in telemedicine to help diagnose and monitor patients remotely. Existing VID techniques are based on recurrent neural networks or optical flow for feature aggregation to produce reliable features which can be used for detection. Some of those methods aggregate features on the full-sequence level or from nearby frames. To create feature maps, existing VID techniques frequently use Convolutional Neural Networks (CNNs) as the backbone network. On the other hand, Vision Transformers have outperformed CNNs in various vision tasks, including object detection in still images and image classification. We propose in this research to use Swin-Transformer, a state-of-the-art Vision Transformer, as an alternative to CNN-based backbone networks for object detection in videos. The proposed architecture enhances the accuracy of existing VID methods. The ImageNet VID and EPIC KITCHENS datasets are used to evaluate the suggested methodology. We have demonstrated that our proposed method is efficient by achieving 84.3% mean average precision (mAP) on ImageNet VID using less memory in comparison to other leading VID techniques. The source code is available at https://github.com/amaharek/SwinVid.
Keywords: video object detection; vision transformers; convolutional neural networks; deep learning
Automated Video-Based Face Detection Using Harris Hawks Optimization with Deep Learning
20
Authors: Latifah Almuqren, Manar Ahmed Hamza, Abdullah Mohamed, Amgad Atta Abdelmageed. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 4917-4933 (17 pages)
Face recognition technology automatically identifies an individual from image or video sources. The detection process can be done by extracting facial characteristics from the image of a subject's face. Recent developments in deep learning (DL) and computer vision (CV) techniques enable the design of automated face recognition and tracking methods. This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking (HHODL-AFDT) method. The proposed HHODL-AFDT model involves a Faster region-based convolutional neural network (RCNN)-based face detection model and an HHO-based hyperparameter optimization process. The presented optimal Faster RCNN model precisely recognizes the face, which is then passed into the face-tracking model using a regression network (REGN). Face tracking with the REGN model uses the features from neighboring frames and predicts the location of the target face in succeeding frames. The application of the HHO algorithm for optimal hyperparameter selection constitutes the novelty of the work. The experimental validation of the presented HHODL-AFDT algorithm is conducted using two datasets, and the experimental outcomes highlight the superior performance of the HHODL-AFDT model over current methodologies, with maximum accuracy of 90.60% and 88.08% on the PICS and VTB datasets, respectively.
Keywords: face detection; face tracking; deep learning; computer vision; video surveillance; parameter tuning