Funding: Supported by the Natural Science Foundation of China (62102147); the National Science Foundation of Hunan Province (2022JJ30424, 2022JJ50253, and 2022JJ30275); the Scientific Research Project of Hunan Provincial Department of Education (21B0616 and 21B0738); the Hunan University of Arts and Sciences Ph.D. Start-Up Project (BSQD02, 20BSQD13); and the Construct Program of Applied Characteristic Discipline in Hunan University of Science and Engineering.
Abstract: In the environment of smart examination rooms, it is important to quickly and accurately detect abnormal behavior (human standing) for the construction of a smart campus. Based on deep learning, we propose an intelligent standing human detection (ISHD) method built on an improved single shot multibox detector to detect standing human postures in exam-room video surveillance frames at a specific examination stage. ISHD combines the MobileNet network with the single shot multibox detector, improves the posture feature extractor for a standing person, merges prior knowledge, and introduces transfer learning into the training strategy, which greatly reduces the computational cost, improves detection accuracy, and lowers the training difficulty. Experiments show that the proposed model has better detection ability for small and medium-sized standing human postures in video examination scenes on the EMV-2 dataset.
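For a concrete starting point, the sketch below shows how an off-the-shelf SSD-Lite detector with a MobileNetV3 backbone (torchvision) can flag "person" detections whose box geometry suggests a standing posture. This only illustrates the SSD + MobileNet combination the abstract describes; the ISHD posture feature extractor, prior knowledge, and transfer-learning setup are not reproduced, and the score and aspect-ratio thresholds are assumptions.

```python
# Minimal sketch (not the authors' ISHD implementation): a pre-trained
# SSDLite + MobileNetV3 detector used to keep "person" boxes that are
# noticeably taller than wide, a crude cue for a standing posture.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# weights="DEFAULT" assumes a recent torchvision release
model = torchvision.models.detection.ssdlite320_mobilenet_v3_large(weights="DEFAULT")
model.eval()

PERSON_CLASS_ID = 1  # COCO "person" label in torchvision detection models

def detect_standing(frame_path, score_thr=0.5, aspect_thr=1.5):
    """Return boxes of detected persons that are taller than they are wide."""
    img = to_tensor(Image.open(frame_path).convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]
    standing = []
    for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
        if label.item() == PERSON_CLASS_ID and score.item() >= score_thr:
            x1, y1, x2, y2 = box.tolist()
            if (y2 - y1) / max(x2 - x1, 1e-6) >= aspect_thr:  # crude standing cue
                standing.append((x1, y1, x2, y2, score.item()))
    return standing
```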
Funding: We deeply acknowledge Taif University for supporting and funding this study through Taif University Researchers Supporting Project Number (TURSP-2020/115), Taif University, Taif, Saudi Arabia.
Abstract: In recent years, the number of gun-related incidents has exceeded 250,000 per year, and over 85% of the existing 1 billion firearms are in civilian hands; manual monitoring has not proven effective in detecting firearms, which is why an automated weapon detection system is needed. Various automated convolutional neural network (CNN) weapon detection systems have been proposed in the past and generate good results. However, these techniques have high computational overhead and are too slow to provide the real-time detection that a weapon detection system requires. These models also have a high rate of false negatives because they often fail to detect guns due to the low quality and poor visibility of surveillance videos. This research work aims to minimize the rate of false negatives and false positives in weapon detection while keeping detection speed as a key parameter. The proposed framework is based on You Only Look Once (YOLO) and Area of Interest (AOI). Initially, the models take pre-processed frames in which the background is removed using the Gaussian blur algorithm. The proposed architecture is assessed through various performance parameters such as false negatives, false positives, precision, recall rate, and F1 score. The results of this research work make it clear that, owing to YOLOv5s, a high recall rate and detection speed are achieved: the speed reached 0.010 s per frame compared to 0.17 s for Faster R-CNN. The framework is promising for use in the field of security and weapon detection.
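A hedged sketch of the kind of Gaussian-blur-based background suppression described as the pre-processing step is shown below; it uses OpenCV frame differencing, and the kernel size and threshold are illustrative values, not taken from the paper.

```python
# Sketch of Gaussian-blur-based background suppression before detection:
# blur both frames to reduce sensor noise, difference them, and threshold.
import cv2

def foreground_mask(prev_frame, frame, ksize=(21, 21), thresh=25):
    """Return a binary mask of regions that changed between two BGR frames."""
    g_prev = cv2.GaussianBlur(cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY), ksize, 0)
    g_cur = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), ksize, 0)
    diff = cv2.absdiff(g_prev, g_cur)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask

# The detector can then be run only on frames (or regions) where the mask is
# non-empty, which reduces false positives from static background clutter.
```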
Funding: Supported by the National Science and Technology Major Project (No. 2011ZX03005-004-04); the National Grand Fundamental Research 973 Program of China (No. 2011CB302-905); the National Natural Science Foundation of China (No. 61170058, 61272133, and 51274202); the Research Fund for the Doctoral Program of Higher Education of China (No. 20103402110041); and the Suzhou Fundamental Research Project (No. SYG201143).
Abstract: Resource allocation is an important problem in ubiquitous networks. Most existing resource allocation methods consider only wireless networks and are therefore not suitable for the ubiquitous network environment; they also harm the interests of individual users whose resource requirements are unstable. This paper considers multi-point video surveillance scenarios in a complex network environment with both wired and wireless networks. We introduce a utility estimated by the total costs of an individual network user. The problem is studied through mathematical modeling, and we propose an improved problem-specific branch-and-cut algorithm to solve it. The algorithm follows the divide-and-conquer principle and fully considers the duality feature of network selection. The experiment is conducted by simulation in C and Lingo, and it shows that, compared with a centralized random allocation scheme and a cost-greedy allocation scheme, the proposed scheme performs better, reducing the user's total costs by 13.0% and 30.6%, respectively.
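The branch-and-cut formulation can be illustrated, in a much reduced form, with the toy network-selection model below, solved with PuLP's bundled CBC solver (which itself applies branch-and-cut); the points, networks, costs, and capacities are invented for illustration, and this is not the authors' problem-specific algorithm.

```python
# Toy cost-minimizing assignment of surveillance points to access networks,
# solved as a binary integer program with PuLP/CBC. All data are placeholders.
import pulp

points = ["cam1", "cam2", "cam3"]          # surveillance points
networks = ["wired", "wifi", "cellular"]   # candidate access networks
cost = {("cam1", "wired"): 3, ("cam1", "wifi"): 2, ("cam1", "cellular"): 5,
        ("cam2", "wired"): 4, ("cam2", "wifi"): 6, ("cam2", "cellular"): 3,
        ("cam3", "wired"): 2, ("cam3", "wifi"): 4, ("cam3", "cellular"): 4}
capacity = {"wired": 2, "wifi": 1, "cellular": 2}  # max points per network

prob = pulp.LpProblem("network_selection", pulp.LpMinimize)
x = pulp.LpVariable.dicts("assign", (points, networks), cat="Binary")
prob += pulp.lpSum(cost[p, n] * x[p][n] for p in points for n in networks)
for p in points:                               # each point uses exactly one network
    prob += pulp.lpSum(x[p][n] for n in networks) == 1
for n in networks:                             # respect network capacity
    prob += pulp.lpSum(x[p][n] for p in points) <= capacity[n]
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({p: next(n for n in networks if x[p][n].value() == 1) for p in points})
```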
Abstract: Due to the increasing demand for developing a secure and smart living environment, intelligent video surveillance technology has attracted considerable attention. Building an automatic, reliable, secure, and intelligent video surveillance system has spawned large research projects and triggered many popular research topics in several international conferences and workshops recently. This special issue of the Journal of Electronic Science and Technology (JEST) aims to present recent advances in video surveillance systems that address the observation of people in an environment, leading to a real-time description of their actions and interactions.
Abstract: Real-time video surveillance systems are commonly employed to aid security professionals in preventing crimes. The use of deep learning (DL) technologies has transformed real-time video surveillance into smart video surveillance systems that automate human behavior classification. The recognition of events in surveillance videos is considered a hot research topic in computer science and is gaining significant attention. Human action recognition (HAR) is treated as a crucial issue in several application areas and in smart video surveillance for improving the security level. Advances in DL models help to accomplish improved recognition performance. In this view, this paper presents a smart deep learning-based human behavior classification (SDL-HBC) model for real-time video surveillance. The proposed SDL-HBC model employs adaptive median filtering (AMF) based pre-processing to reduce the noise content. The capsule network (CapsNet) model is utilized for the extraction of feature vectors, and the hyperparameter tuning of the CapsNet model is performed with the Adam optimizer. Finally, differential evolution (DE) with a stacked autoencoder (SAE) model is applied for the classification of human activities in the intelligent video surveillance system. The performance validation of the SDL-HBC technique is carried out using two benchmark datasets, including the KTH dataset. The experimental outcomes report the enhanced recognition performance of the SDL-HBC technique over recent state-of-the-art approaches, with a maximum accuracy of 0.9922.
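As an illustration of the AMF pre-processing step, the following plain-NumPy sketch implements the textbook adaptive median filter (grow the window until the local median is not an impulse, then replace only noisy pixels); it is a generic reference implementation, not the authors' code, and the maximum window size is an assumption.

```python
# Textbook adaptive median filter for impulse noise on a grayscale image.
import numpy as np

def adaptive_median_filter(img, max_window=7):
    """img: 2-D uint8 array. Grows the window until the median is not an impulse."""
    pad = max_window // 2
    padded = np.pad(img, pad, mode="edge")
    out = img.copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            for win in range(3, max_window + 1, 2):
                half = win // 2
                cy, cx = y + pad, x + pad
                patch = padded[cy - half:cy + half + 1, cx - half:cx + half + 1]
                zmin, zmed, zmax = patch.min(), np.median(patch), patch.max()
                if zmin < zmed < zmax:                    # median is not noise
                    if not (zmin < img[y, x] < zmax):     # centre pixel is noise
                        out[y, x] = zmed
                    break
            else:                                         # window limit reached
                out[y, x] = zmed
    return out
```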
Funding: Supported by the "MOST" under Grant No. 103-2221-E-468-008-MY2.
Abstract: This paper presents a human detection system for a vision-based hospital surveillance environment. The system is composed of three subsystems: the background segmentation subsystem (BSS), the human feature extraction subsystem (HFES), and the human recognition subsystem (HRS). The codebook background model is applied in the BSS, histogram of oriented gradients (HOG) features are used in the HFES, and support vector machine (SVM) classification is employed in the HRS. By integrating these subsystems, human detection in a vision-based hospital surveillance environment is performed. Experimental results show that the proposed system can effectively detect most of the people in hospital surveillance video sequences.
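The HOG + SVM detection stage can be approximated with OpenCV's built-in HOG pedestrian detector, as in the minimal sketch below; note that this uses OpenCV's pre-trained SVM rather than the paper's codebook background model and custom-trained classifier, and the video filename is hypothetical.

```python
# Minimal HOG + SVM people detection with OpenCV's default pedestrian model.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect_people(frame):
    """Return bounding boxes (x, y, w, h) of detected people in a BGR frame."""
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8),
                                          padding=(8, 8), scale=1.05)
    return [tuple(r) for r in rects]

cap = cv2.VideoCapture("hospital_ward.avi")   # hypothetical input video
ok, frame = cap.read()
if ok:
    print(detect_people(frame))
```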
Abstract: This paper proposes a mobile video surveillance system consisting of intelligent video analysis and mobile communication networking. This multilevel distillation approach helps mobile users monitor an enormous volume of surveillance video on demand through video streaming over mobile communication networks. The intelligent video analysis includes moving object detection/tracking and key frame selection, which lets users browse useful video clips. The communication networking services, comprising video transcoding, multimedia messaging, and mobile video streaming, deliver surveillance information to mobile appliances. Moving object detection is achieved by background subtraction and particle filter tracking. Key frame selection, which aims to deliver an alarm to a mobile client via the multimedia messaging service accompanied by an extracted clear frame, is achieved by devising a weighted importance criterion that considers object clarity and face appearance. In addition, a spatial-domain cascaded transcoder is developed to convert the filtered image sequence of detected objects into the mobile video streaming format. Experimental results show that the system can successfully detect all events of moving objects in a complex surveillance scene, choose very appropriate key frames for users, and transcode the images with a high peak signal-to-noise ratio (PSNR).
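For the moving-object-detection stage, a common stand-in for background subtraction is OpenCV's MOG2 model, sketched below; it replaces the paper's specific background model and particle-filter tracking, and the history, variance, and minimum-area values are illustrative.

```python
# Background-subtraction-based moving object detection with OpenCV MOG2.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def moving_objects(frame, min_area=500):
    """Return bounding rectangles of sufficiently large foreground blobs."""
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
```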
Funding: We deeply acknowledge Taif University for supporting and funding this study through Taif University Researchers Supporting Project Number (TURSP-2020/115), Taif University, Taif, Saudi Arabia.
Abstract: In recent years, video surveillance applications have played a significant role in our daily lives. Images taken during foggy and hazy weather conditions lose their authenticity and hence reduce visibility for video surveillance applications. The purpose of enhancing the visibility of foggy and hazy images is to help numerous computer and machine vision applications such as satellite imagery, object detection, target killing, and surveillance. To remove fog and enhance visibility, a number of visibility enhancement algorithms and methods have been proposed in the past. However, these techniques suffer from several limitations that place strong obstacles in the way of real-world outdoor computer vision applications. The existing techniques do not perform well when images contain heavy fog, large white regions, and strong atmospheric light. This research work proposes a new framework to defog and dehaze images in order to enhance the visibility of foggy and hazy images. The proposed framework is based on a conditional generative adversarial network (CGAN) with two networks, a generator and a discriminator, each having distinct properties. The generator network generates fog-free images from foggy images, and the discriminator network distinguishes between the restored image and the original fog-free image. Experiments are conducted on the FRIDA dataset and on haze images. To assess the performance of the proposed method on the fog dataset we use PSNR and SSIM, and for the haze dataset we use e, r̄, and σ as performance metrics. Experimental results show that the proposed method achieved higher PSNR and SSIM values (18.23 and 0.823) than those produced by the compared method (13.94 and 0.791). The results demonstrate that the proposed framework removes fog and enhances the visibility of foggy and hazy images.
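The PSNR and SSIM figures quoted above are typically computed as in the following generic scikit-image sketch (not the authors' evaluation code); the channel_axis argument assumes a recent scikit-image release.

```python
# Generic full-reference quality metrics for a restored (defogged) image.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def fidelity(restored, reference):
    """Both inputs are uint8 RGB arrays of the same shape."""
    psnr = peak_signal_noise_ratio(reference, restored, data_range=255)
    ssim = structural_similarity(reference, restored, channel_axis=-1, data_range=255)
    return psnr, ssim
```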
Abstract: Generating ground truth data for developing object detection algorithms of intelligent surveillance systems is a considerably important yet time-consuming task; therefore, a user-friendly tool to annotate videos efficiently and accurately is required. In this paper, the development of a semi-automatic video annotation tool is described. For efficiency, the developed tool can automatically generate the initial annotation data for the input videos utilizing automatic object detection modules, which are developed independently and registered in the tool. To guarantee the accuracy of the ground truth data, the system also has several user-friendly functions to help users check and edit the initial annotation data generated by the automatic object detection modules. According to the experimental results, employing the developed annotation tool is considerably beneficial for reducing annotation time; when compared to manual annotation schemes, using the tool resulted in an annotation time reduction of up to 2.3 times.
Abstract: In this paper, abnormal target detection and localization in video surveillance systems are studied. In recent years, with the rapid development of network information technology, video surveillance has been widely deployed, and manual anomaly detection methods can no longer keep up with the growth of video surveillance data. Technologies such as 3D vision and face recognition have also advanced the field of computer vision and provide effective support for the rapid analysis of large amounts of video data. At present, abnormal target detection methods in video surveillance systems fall mainly into two categories. The first extracts two-dimensional features from video surveillance data and represents video targets according to the extracted features; the information expressed mainly includes temporal and spatial information. The second directly learns 3D spatio-temporal features from modules carrying motion information to detect the location of the abnormal target. Finally, the paper summarizes the full text and looks ahead to future directions of video anomaly detection from three aspects: datasets, methods, and evaluation metrics.
Funding: This research was supported by the Chung-Ang University Research Scholarship Grants in 2021 and the Culture, Sports and Tourism R&D Program through the Korea Creative Content Agency grant funded by the Ministry of Culture, Sports, and Tourism in 2022 (Project Name: Development of Digital Quarantine and Operation Technologies for Creation of Safe Viewing Environment in Cultural Facilities, Project Number: R2021040028, Contribution Rate: 100%).
Abstract: In the present technological world, surveillance cameras generate an immense amount of video data from various sources, making its scrutiny tough for computer vision specialists. It is difficult to search for anomalous events manually in these massive video records, since they happen infrequently and with low probability in real-world monitoring systems. Therefore, intelligent surveillance is a requirement of the modern day, as it enables the automatic identification of normal and aberrant behavior using artificial intelligence and computer vision technologies. In this article, we introduce an efficient attention-based deep learning approach for anomaly detection in surveillance video (ADSV). At the input of ADSV, a shot boundary detection technique is used to segment prominent frames. Next, the Lightweight Convolutional Neural Network (LWCNN) model receives the segmented frames and extracts spatial and temporal information from its intermediate layer. Following that, spatial and temporal features are learned with Long Short-Term Memory (LSTM) cells and an attention network from a series of frames for each anomalous activity in a sample. To detect motion and action, the LWCNN receives chronologically sorted frames. Finally, the anomalous activity in the video is identified using the proposed trained ADSV model. Extensive experiments are conducted on complex and challenging benchmark datasets. In addition, the experimental results have been compared to state-of-the-art methodologies, and a significant improvement is attained, demonstrating the efficiency of our ADSV method.
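The "CNN features, then LSTM, then attention, then classifier" pattern the ADSV pipeline describes can be sketched in PyTorch as below; the feature dimension, hidden size, and attention form are assumptions for illustration, and the LWCNN backbone is replaced by pre-extracted frame features.

```python
# Attention-pooled LSTM classifier over per-frame CNN features (illustrative sizes).
import torch
import torch.nn as nn

class AttnLSTMClassifier(nn.Module):
    def __init__(self, feat_dim=1280, hidden=256, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)          # scores each time step
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, feats):                     # feats: (batch, time, feat_dim)
        h, _ = self.lstm(feats)                   # (batch, time, hidden)
        w = torch.softmax(self.attn(h), dim=1)    # attention weights over time
        ctx = (w * h).sum(dim=1)                  # attention-pooled context vector
        return self.head(ctx)

model = AttnLSTMClassifier()
dummy = torch.randn(4, 16, 1280)                  # 4 clips x 16 frames of features
print(model(dummy).shape)                         # torch.Size([4, 2])
```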
Abstract: For intelligent surveillance videos, anomaly detection is extremely important. Deep learning algorithms have become popular for evaluating real-time surveillance recordings of events such as traffic accidents and criminal or unlawful incidents, including suicide attempts. Nevertheless, deep learning methods for classification, like convolutional neural networks, require a lot of computing power. Quantum computing is a branch of technology that solves abnormal and complex problems using quantum mechanics. As a result, the focus of this research is on developing a hybrid quantum computing model based on deep learning. This research develops a Quantum Computing-based Convolutional Neural Network (QC-CNN) to extract features and classify anomalies from surveillance footage. A quantum-based circuit, such as the real amplitude circuit, is utilized to improve the performance of the model. To the best of our knowledge, this is the first work to employ quantum deep learning techniques to classify anomalous events in video surveillance applications. Thirteen anomaly classes from the UCF-Crime dataset are classified. Based on the experimental results, the proposed model efficiently classifies data in terms of the confusion matrix, Receiver Operating Characteristic (ROC), accuracy, Area Under Curve (AUC), precision, recall, and F1-score. The proposed QC-CNN attains the best accuracy of 95.65%, which is 5.37% higher than the other existing models. To measure the efficiency of the proposed work, QC-CNN is also evaluated against classical and quantum models.
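The "real amplitude circuit" mentioned above corresponds to the RealAmplitudes ansatz available in Qiskit's circuit library; the toy snippet below only shows how such a parameterized circuit is constructed (the qubit count and repetitions are arbitrary) and is not the authors' QC-CNN.

```python
# Building a RealAmplitudes parameterized circuit with Qiskit (illustrative sizes).
from qiskit.circuit.library import RealAmplitudes

ansatz = RealAmplitudes(num_qubits=4, reps=2)   # 4 qubits, 2 entangling layers
print(ansatz.num_parameters)                    # number of trainable rotation angles
print(ansatz.decompose().draw())                # gate-level view of the ansatz
```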
Funding: This work was supported by the CAAI-Huawei MindSpore Open Fund and the General Program of the Natural Science Foundation of Fujian Province, China (No. 2020J01473).
Abstract: Infant drowning has occurred frequently in swimming pools in recent years, which motivates research on automatic real-time detection of such accidents. Unlike youths or adults, swimming infants are small in size and motion range and are unable to send out distress signals in emergencies, which makes drowning detection harder. Aiming at this problem, a new step is taken towards detecting infant drowning automatically and efficiently based on video surveillance. Diverse live-scene videos of infant swimming and drowning are collected from a variety of natatoriums and labeled as datasets. Part of the datasets is downscaled or enlarged to enhance the generalization ability of the model. On this basis, the advantages of Faster R-CNN and a series of YOLOv5 models are specifically explored to enable fast and accurate detection of infant drowning in the real world. Supervised learning experiments are carried out; model test results show that the mean Average Precision (mAP) of either Faster R-CNN or YOLOv5s (from the YOLOv5 series) can exceed 89%; the former can process merely 6 frames of video per second with a precision of only 62.04%, while the latter reaches an average speed of 75 frames/s with a precision of about 86.6%. YOLOv5s eventually stands out as the optimal model for detecting infant drowning in view of its comprehensive performance, which is of great application value for reducing accidents in swimming pools.
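The frames-per-second comparison reported above is the kind of measurement sketched below, which loads a pre-trained YOLOv5s via torch.hub and times per-frame inference; this is not the authors' training or evaluation code, and it assumes network access to download the model on first use.

```python
# Timing per-frame inference of a pre-trained YOLOv5s model (illustrative only).
import time
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # downloads weights on first use
model.eval()

def fps_on(frames):
    """frames: list of HxWx3 uint8 NumPy arrays (BGR or RGB)."""
    start = time.time()
    with torch.no_grad():
        for f in frames:
            model(f)                              # forward pass per frame
    return len(frames) / (time.time() - start)
```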
Abstract: Video synopsis is an effective way to summarize long-recorded surveillance videos. The omnidirectional view allows the observer to select the desired fields of view (FoV) from the different FoV available in a spherical surveillance video. By choosing to watch one portion, the observer misses the events occurring elsewhere in the spherical scene, which causes the observer to experience fear of missing out (FOMO). Hence, a novel personalized video synopsis approach for generating non-spherical videos is introduced to address this issue. It also includes an action recognition module that makes it easy to display necessary actions by prioritizing them. This work minimizes or maximizes multiple objectives, namely the loss of activity, collision, temporal consistency, length, show, and important-action costs. The performance of the proposed framework is evaluated through extensive simulation and compared with state-of-the-art video synopsis optimization algorithms. Experimental results suggest that some constraints are better optimized by using the latest metaheuristic optimization algorithms to generate compact personalized synopsis videos from spherical surveillance videos.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R349), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. This study is supported via funding from Prince Sattam bin Abdulaziz University, Project Number (PSAU/2023/R/1444).
Abstract: Face recognition technology automatically identifies an individual from image or video sources. The detection process can be carried out by obtaining facial characteristics from the image of a subject's face. Recent developments in deep learning (DL) and computer vision (CV) techniques enable the design of automated face recognition and tracking methods. This study presents a novel Harris Hawks Optimization with deep learning-empowered automated face detection and tracking (HHODL-AFDT) method. The proposed HHODL-AFDT model involves a Faster Region-based Convolutional Neural Network (Faster RCNN) face detection model and an HHO-based hyperparameter optimization process. The presented optimal Faster RCNN model precisely recognizes the face, which is then passed to the face-tracking model based on a regression network (REGN). Face tracking with the REGN model uses features from neighboring frames and predicts the location of the target face in succeeding frames. The application of the HHO algorithm for optimal hyperparameter selection constitutes the novelty of the work. The experimental validation of the presented HHODL-AFDT algorithm is conducted using two datasets, and the experimental outcomes highlight the superior performance of the HHODL-AFDT model over current methodologies, with maximum accuracies of 90.60% and 88.08% on the PICS and VTB datasets, respectively.
Funding: Supported by the National Natural Science Foundation of China (No. 61502256).
Abstract: In this paper, we propose a video searching system that utilizes face recognition as its indexing feature. As the use of video cameras has increased greatly in recent years, face recognition is a perfect fit for searching for targeted individuals within the vast amount of video data. However, the performance of such searching depends on the quality of the face images recorded in the video signals. Since surveillance video cameras record without fixed postures for the subject, face occlusion is very common in everyday video. The proposed system builds a model for occluded faces using fuzzy principal component analysis (FPCA) and reconstructs the human faces with the available information. Experimental results show that the system is highly efficient in processing real-life videos and is very robust to various kinds of face occlusion. Hence it can relieve human reviewers from sitting in front of the monitors and greatly enhances efficiency as well. The proposed system has been installed and applied in various environments and has already demonstrated its power by helping solve real cases.
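As a simplified stand-in for the reconstruction idea, the sketch below re-projects an occluded face onto a PCA face subspace with scikit-learn; the paper's fuzzy PCA additionally down-weights unreliable (occluded) pixels, which is not reproduced here, and the number of components is an assumption.

```python
# Standard PCA face-subspace reconstruction (a simplified stand-in for FPCA).
import numpy as np
from sklearn.decomposition import PCA

def build_face_space(face_vectors, n_components=50):
    """face_vectors: (n_faces, n_pixels) array of aligned, flattened face images."""
    return PCA(n_components=n_components).fit(face_vectors)

def reconstruct(pca, occluded_face):
    """Project an occluded face onto the subspace and map it back to pixel space."""
    coeffs = pca.transform(occluded_face.reshape(1, -1))
    return pca.inverse_transform(coeffs).reshape(occluded_face.shape)
```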
Abstract: Background: Video anomaly detection has always been a hot topic and has attracted increasing attention. Many of the existing methods for video anomaly detection depend on processing the entire video rather than considering only the significant context. Method: This paper proposes a novel video anomaly detection method called COVAD that mainly focuses on the region of interest in the video instead of the entire video. Our proposed COVAD method is based on an autoencoded convolutional neural network and a coordinated attention mechanism, which can effectively capture meaningful objects in the video and the dependencies among different objects. Relying on an existing memory-guided video frame prediction network, our algorithm can predict the future motion and appearance of objects in a video more effectively. Result: The proposed algorithm obtained better experimental results on multiple datasets and outperformed the baseline models considered in our analysis. Simultaneously, we provide an improved visual test that can provide pixel-level anomaly explanations.
Abstract: Action recognition is an important topic in computer vision. Recently, deep learning technologies have been successfully used in many applications, including video data, for solving recognition problems. However, most existing deep learning-based recognition frameworks are not optimized for actions in surveillance videos. In this paper, we propose a novel method to recognize different types of actions in outdoor surveillance videos. The proposed method first introduces motion compensation to improve the detection of human targets. Then, it uses three different types of deep models, with single and sequenced images as inputs, for the recognition of different types of actions. Finally, predictions from the different models are fused with a linear model. Experimental results show that the proposed method works well on real surveillance videos.
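The final fusion step, combining per-model predictions with a linear model, can be illustrated with the small NumPy sketch below; the fusion weights and scores are placeholders, not values from the paper.

```python
# Late fusion: combine per-model class-probability vectors with linear weights.
import numpy as np

def fuse(scores_per_model, weights):
    """scores_per_model: list of (num_classes,) probability vectors, one per model."""
    stacked = np.stack(scores_per_model)           # (num_models, num_classes)
    fused = np.asarray(weights) @ stacked          # linear combination of models
    return int(np.argmax(fused)), fused

pred, fused = fuse([np.array([0.7, 0.2, 0.1]),     # model 1 (single-image input)
                    np.array([0.5, 0.4, 0.1]),     # model 2 (sequence input)
                    np.array([0.6, 0.1, 0.3])],    # model 3
                   weights=[0.4, 0.3, 0.3])        # placeholder fusion weights
print(pred, fused)
```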
Funding: The National Key Technology R&D Program of China during the 11th Five-Year Plan Period (No. 2009BAG13A04); the Jiangsu Transportation Science Research Program (No. 08X09); and the Program of Suzhou Science and Technology (No. SG201076).
Abstract: This paper presents an urban expressway video surveillance and monitoring system for traffic flow measurement and abnormal behavior detection. The proposed flow detection module collects traffic flow statistics in real time by leveraging multi-vehicle tracking information. Based on these online statistics, road operating conditions can be easily obtained. Using spatiotemporal trajectories, vehicle motion paths are encoded by hidden Markov models. With path division and parameter matching, abnormal behaviors including extremely low- or high-speed driving, illegal stopping, and illegal turning are detected in real scenes. The traffic surveillance approach is implemented and evaluated on a DM642 DSP-based embedded platform. Experimental results demonstrate that the proposed system is feasible for the detection of vehicle speed, vehicle counts, and road efficiency, and it is effective for monitoring the aforementioned anomalies with low computational cost.
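Encoding vehicle motion paths with hidden Markov models can be sketched with hmmlearn as below; the number of hidden states and the abnormality threshold are illustrative assumptions, and the paper's path division and parameter matching are not reproduced.

```python
# Fit a Gaussian HMM to normal vehicle trajectories and flag low-likelihood paths.
import numpy as np
from hmmlearn import hmm

def fit_path_model(trajectories, n_states=5):
    """trajectories: list of (T_i, 2) arrays of (x, y) image coordinates."""
    X = np.vstack(trajectories)
    lengths = [len(t) for t in trajectories]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
    model.fit(X, lengths)
    return model

def is_abnormal(model, traj, thresh=-8.0):
    """Flag a trajectory whose per-step log-likelihood falls below the threshold."""
    return model.score(traj) / len(traj) < thresh
```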
Funding: The Framework of International Cooperation Program managed by the National Research Foundation of Korea (2019K1A3A1A8011295711).
Abstract: Collaborative robotics is one of the high-interest research topics in academia and industry. It has been progressively utilized in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve in several object detection and tracking tasks. These tasks have been considered challenging, high-level perceptual problems, frequently dominated by relative information about the environment, where concerns such as occlusion, illumination, background, object deformation, and object class variation are commonplace. In order to show the importance of top-view surveillance, a collaborative robotics framework is presented. It can assist in the detection and tracking of multiple objects in top-view surveillance. The framework consists of a smart robotic camera embedded with a visual processing unit. The existing pre-trained deep learning models named SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING. These algorithms, along with the detection models, help to track and predict the trajectories of detected objects. Since pre-trained models are employed, the generalization performance is also investigated by testing the models on various sequences of the top-view dataset. The detection models achieved a maximum true detection rate of 93% to 90% with a maximum false detection rate of 0.6%. The tracking results of the different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, a discussion is carried out on the output results along with future guidelines.
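A minimal detect-then-track loop using one of the listed OpenCV trackers (KCF) is sketched below; the tracker constructor lives in cv2 or cv2.legacy depending on the OpenCV build (the opencv-contrib package is required for several of the listed trackers), the video file is hypothetical, and the initial bounding box would normally come from the SSD/YOLO detector.

```python
# Single-object tracking with OpenCV's KCF tracker, initialized from a detection.
import cv2

def make_kcf():
    # Constructor location varies across OpenCV versions/builds.
    if hasattr(cv2, "TrackerKCF_create"):
        return cv2.TrackerKCF_create()
    return cv2.legacy.TrackerKCF_create()

cap = cv2.VideoCapture("topview.mp4")       # hypothetical top-view sequence
ok, frame = cap.read()
if ok:
    tracker = make_kcf()
    tracker.init(frame, (100, 80, 60, 120))  # (x, y, w, h) from the detector
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        found, box = tracker.update(frame)
        if found:
            x, y, w, h = map(int, box)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```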