Multispectral pedestrian detection technology leverages infrared images to provide reliable information for visible light images, demonstrating significant advantages in low-light conditions and background occlusion scenarios. However, while continuously improving cross-modal feature extraction and fusion, ensuring the model’s detection speed is also a challenging issue. We have devised a deep learning network model for cross-modal pedestrian detection based on ResNet50, aiming to focus on more reliable features and enhance the model’s detection efficiency. This model employs a spatial attention mechanism to reweight the input visible light and infrared image data, enhancing the model’s focus on different spatial positions and sharing the weighted feature data across different modalities, thereby reducing the interference of multi-modal features. Subsequently, lightweight modules with depthwise separable convolution are incorporated to reduce the model’s parameter count and computational load through channel-wise and point-wise convolutions. The network model algorithm proposed in this paper was experimentally validated on the publicly available KAIST dataset and compared with other existing methods. The experimental results demonstrate that our approach achieves favorable performance in various complex environments, affirming the effectiveness of the multispectral pedestrian detection technology proposed in this paper.
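The parameter saving from depthwise separable convolution mentioned above can be illustrated with a short calculation. This is a minimal sketch, not the paper's network: the layer sizes (256 input channels, 256 output channels, 3×3 kernel) and the function names are assumptions chosen for illustration, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not taken from the paper).
c_in, c_out, k = 256, 256, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly one ninth of the weights, which is the general reason such modules reduce parameter count and computation.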
This study explores the challenges posed by pedestrian detection and occlusion in AR applications, employing a novel approach that utilizes RGB-D-based skeleton reconstruction to reduce the overhead of classical pedestrian detection algorithms during training. Furthermore, it is dedicated to addressing occlusion issues in pedestrian detection by using Azure Kinect for body tracking and integrating a robust occlusion management algorithm, significantly enhancing detection efficiency. In experiments, an average latency of 204 milliseconds was measured, and the detection accuracy reached 97%. Additionally, this approach has been successfully applied in creating a simple yet captivating augmented reality game, demonstrating the practical application of the algorithm.
Real-time pedestrian detection is an important task for unmanned driving systems and video surveillance. Existing pedestrian detection methods often run at low speed and lose detection accuracy on smaller and densely distributed pedestrians. Therefore, the proposed YOLOv2 (“You Only Look Once Version 2”)-based pedestrian detection algorithm (referred to as YOLOv2PD) is intended to be more suitable for detecting smaller and densely distributed pedestrians in real-time complex road scenes. The proposed YOLOv2PD algorithm adopts a Multi-layer Feature Fusion (MLFF) strategy, which helps to improve the model’s feature extraction ability. In addition, one repeated convolution layer is removed from the final layer, which in turn reduces the computational complexity without losing any detection accuracy. The proposed algorithm applies the K-means clustering method on the Pascal VOC 2007+2012 pedestrian dataset before training to find the optimal anchor boxes. Both the proposed network structure and the loss function are improved to make the model more accurate and faster while detecting smaller pedestrians. Experimental results show that, at 544×544 image resolution, the proposed model achieves 80.7% average precision (AP), which is 2.1% higher than the YOLOv2 model on the Pascal VOC 2007+2012 pedestrian dataset. Besides, based on the experimental results, the proposed YOLOv2PD model achieves a good trade-off between detection accuracy and real-time speed when evaluated on the INRIA and Caltech pedestrian test datasets and achieves state-of-the-art detection results.
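The anchor-box step described above can be sketched as K-means over box widths and heights. This is a minimal sketch under stated assumptions: it uses the IoU-based assignment that YOLOv2 is known for (highest IoU, equivalent to smallest 1 - IoU distance), the abstract does not say which distance the authors used, and the sample boxes and function names are made up for illustration.

```python
import random


def iou_wh(a, b):
    """IoU of two boxes given as (width, height), both anchored at the origin."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchors using IoU as similarity."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the centroid it overlaps most (smallest 1 - IoU).
            best = max(range(k), key=lambda j: iou_wh(box, centroids[j]))
            clusters[best].append(box)
        new_centroids = []
        for j, members in enumerate(clusters):
            if not members:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_centroids.append((w, h))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)


# Hypothetical ground-truth box sizes in pixels: three small and three large pedestrians.
boxes = [(10, 20), (12, 22), (11, 21), (50, 100), (52, 98), (48, 102)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which matters when small pedestrians are the target.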
Focusing on data imbalance and intraclass variation, an improved pedestrian detection method with a cascade of complex peer AdaBoost classifiers is proposed. The series of AdaBoost classifiers is learned greedily, along with negative example mining. The complexity of classifiers in the cascade is not limited, so more negative examples are used for training. Furthermore, the cascade becomes an ensemble of strong peer classifiers, which treats intraclass variation. To locally train the AdaBoost classifiers with a high detection rate, a refining strategy is used to discard the hardest negative training examples rather than decreasing their thresholds. Using the aggregate channel feature (ACF), the method achieves miss rates of 35% and 14% on the Caltech pedestrian benchmark and the INRIA pedestrian dataset, respectively, which are lower than those of increasingly complex AdaBoost classifiers, i.e., 44% and 17%, respectively. Using deep features extracted by the region proposal network (RPN), the method achieves a miss rate of 10.06% on the Caltech pedestrian benchmark, which is also lower than the 10.53% of the increasingly complex cascade. This study shows that the proposed method can use more negative examples to train the pedestrian detector. It outperforms the existing cascade of increasingly complex classifiers.
The rapid development of edge computing has driven an increasing number of deep learning applications, such as pedestrian and vehicle detection, to be deployed at the edge of the network to provide efficient intelligent services to mobile users. However, as accuracy requirements continue to increase, deep learning models for pedestrian and vehicle detection, such as YOLOv4, become more sophisticated and the computing resources required for model training increase dramatically, which makes it challenging to deploy them effectively on resource-constrained edge devices while maintaining high accuracy. To address this challenge, a cloud-edge collaboration-based pedestrian and vehicle detection framework is proposed in this paper. It trains models sufficiently using the abundant computing resources in the cloud and then deploys the well-trained models on edge devices, thus reducing the computing resources required for model training on edge devices. Furthermore, to reduce the size of the model deployed on edge devices, an automatic pruning method that combines the convolution layer and the BN layer is proposed to compress the pedestrian and vehicle detection model. Experimental results show that the proposed framework is able to deploy the pruned model on a real edge device, Jetson TX2, with 6.72 times higher FPS. Meanwhile, the channel pruning reduces the volume and the number of parameters of the model to 96.77%, and the amount of computation is reduced to 81.37%.
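The abstract does not spell out how the convolution and BN layers are combined for pruning. One common criterion, shown here purely as an assumption and not as the authors' method, ranks each output channel by the magnitude of its BN scale factor and drops the channels with the smallest values, together with the matching convolution filters. The function names and sample values are hypothetical.

```python
def select_channels(bn_gammas, prune_ratio):
    """Return sorted indices of channels to keep, dropping the smallest |gamma| first."""
    n_prune = int(len(bn_gammas) * prune_ratio)
    order = sorted(range(len(bn_gammas)), key=lambda i: abs(bn_gammas[i]))
    pruned = set(order[:n_prune])
    return [i for i in range(len(bn_gammas)) if i not in pruned]


def prune_filters(filters, keep):
    """Keep only the convolution filters whose output channels survive pruning."""
    return [filters[i] for i in keep]


# Hypothetical BN scale factors for a layer with six output channels.
gammas = [0.9, 0.01, -0.5, 0.002, 0.3, -0.04]
filters = [f"filter_{i}" for i in range(len(gammas))]  # stand-ins for weight tensors

keep = select_channels(gammas, prune_ratio=0.5)
pruned_filters = prune_filters(filters, keep)
print(keep, pruned_filters)
```

In a real network the input channels of the following layer would have to be pruned with the same index list so that tensor shapes still match.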
A real-time pedestrian detection and tracking system using a single video camera was developed to monitor pedestrians. This system contained six modules: video flow capture, pre-processing, movement detection, shadow removal, tracking, and object classification. The Gaussian mixture model was utilized to extract the moving object from an image sequence segmented by the mean-shift technique in the pre-processing module. Shadow removal was used to alleviate the negative impact of shadows on the detected objects. A model-free method was adopted to identify pedestrians. The maximum and minimum integration methods were developed to integrate multiple cues into the mean-shift algorithm and the initial tracking iteration with the competent integrated probability distribution map for object tracking. A simple but effective algorithm was proposed to handle full occlusion cases. The system was tested using real traffic videos from different sites. The results of the test confirm that the system is reliable and has an overall accuracy of over 85%.
In recent years, pedestrian detection has been a hot research topic in computer vision and artificial intelligence, and it is widely used in security and pedestrian analysis. However, because traditional pedestrian detection technology requires a large amount of computation, the speed of many pedestrian recognition systems is very limited. In some restricted areas, such as hazardous construction areas, real-time detection of pedestrians and boundary-crossing behavior is required. To detect more conveniently and efficiently whether pedestrians are present in a restricted area and whether they cross its boundary, this paper proposes a pedestrian boundary-crossing detection method based on HOG (Histogram of Oriented Gradients) and SVM (Support Vector Machine). This method extracts the moving target through GMM (Gaussian Mixture Model) background modeling and then extracts the features of the moving target using HOG. Finally, it uses a trained SVM to distinguish pedestrians from non-pedestrians, completes the detection of pedestrians, and labels the targets. The test results show that extracting HOG features only from the candidate area greatly reduces the amount of computation and the feature extraction time and eliminates background interference, thereby improving detection efficiency, so the method can be applied in settings with real-time requirements.
Pedestrian detection and tracking are vital elements of today’s surveillance systems, which make daily life safe for humans. Thus, human detection and visualization have become essential in the field of computer vision. However, developing a surveillance system with multiple object recognition and tracking, especially in low light and at night, is still challenging. Therefore, we propose a novel system based on machine learning and image processing to provide an efficient surveillance system for pedestrian detection and tracking at night. In particular, we propose a system that tackles a two-fold problem by detecting multiple pedestrians in infrared (IR) images using machine learning and tracking them using particle filters. Moreover, a random forest classifier is adopted for image segmentation to identify pedestrians in an image. The detection result is passed to a particle filter to perform pedestrian tracking. In extensive experiments, our system shows 93% segmentation accuracy using a random forest algorithm, with high accuracy for the background and roof classes. Moreover, the system achieved a detection accuracy of 90% using multiple template matching techniques and 81% accuracy for pedestrian tracking. Furthermore, our system can identify that the detected object is a human. Hence, our system provided the best results compared to state-of-the-art systems, which supports the effectiveness of the techniques used for image segmentation, classification, and tracking. The presented method is applicable to human detection and tracking, crowd analysis, and monitoring pedestrians in IR video surveillance.
In object detection, detecting an object with 100 pixels is substantially different from detecting an object with 10 pixels. Many object detection algorithms, such as the DPM detector, assume that the pedestrian scale is fixed during detection. However, detectors often perform differently at different scales. If a detector is adapted to perform pedestrian detection at different scales, the accuracy of pedestrian detection could be improved. A multi-resolution DPM pedestrian detection algorithm is proposed in this paper. During model training, a resolution factor is added to the set of hidden variables of a latent SVM model. Then, during detection, a standard DPM model is used for high-resolution objects and a rigid template is adopted for low-resolution objects. In our experiments, we find that for low-resolution objects the detection accuracy of a standard DPM model is lower than that of a rigid template. On Caltech, the miss rate of the multi-resolution DPM detector is 52% at 1 false positive per image (1 FPPI), whereas the miss rate of a standard DPM detector rises to 59% (1 FPPI). On the large-scale sample set of Caltech, the miss rates of the multi-resolution and the standard DPM detectors are 18% (1 FPPI) and 26% (1 FPPI), respectively.
To address poor illumination and inaccurate localization, this paper proposes a pedestrian detection algorithm suitable for low-light environments. The algorithm first applies the multi-scale Retinex image enhancement algorithm to sample pre-processing for deep learning to improve image quality. A Faster Region-based Convolutional Neural Network (Faster R-CNN) is then used to train the pedestrian detection model, extract pedestrian features, and obtain bounding boxes through classification and position regression. Finally, the Soft-NMS algorithm is introduced into the detection process to eliminate redundant bounding boxes and obtain the best pedestrian detection position. The experimental results show that the proposed detection algorithm achieves an average accuracy of 89.74% on the low-light dataset, with a more pronounced pedestrian detection effect.
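The Soft-NMS step mentioned above can be sketched as follows. This is a minimal sketch of the Gaussian variant, assuming boxes in (x1, y1, x2, y2) form with positive area; the abstract does not say which decay function or parameters the authors used, so `sigma`, `score_thresh` and the sample detections are illustrative assumptions.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with positive area."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of deleting them."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Take the highest-scoring remaining box.
        best_idx = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_idx)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in candidates:
            # The more a box overlaps the selected one, the more its score decays.
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((box, score))
        candidates = rescored
    return kept


# Hypothetical detections: two overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
for box, score in kept:
    print(box, round(score, 3))
```

Unlike hard NMS, the overlapping second box is kept with a reduced score rather than discarded, which helps when pedestrians stand close together.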
Presently, video surveillance is commonly employed to ensure security in public places such as traffic signals, malls, and railway stations. A major challenge in video surveillance is the identification of anomalies such as crimes and thefts. Besides, anomaly detection in pedestrian walkways has gained significant attention in the computer vision community as a way to enhance pedestrian safety. Recent advances in Deep Learning (DL) models have received considerable attention in tasks such as object detection and image classification. In this context, this article designs a new Panoptic Feature Pyramid Network based Anomaly Detection and Tracking (PFPN-ADT) model for pedestrian walkways. The proposed model mainly aims to recognize and classify different anomalies present in the pedestrian walkway, such as vehicles and skaters. The proposed model employs a panoptic segmentation model, called Panoptic Feature Pyramid Network (PFPN), for the object recognition process. For object classification, the Compact Bat Algorithm (CBA) with a Stacked Auto Encoder (SAE) is applied to the recognized objects. To verify the anomaly detection performance of the PFPN-ADT technique, a comparison study is made using the University of California San Diego (UCSD) Anomaly data and other benchmark datasets (such as Cityscapes, ADE20K, and COCO), and the outcomes are compared with the Mask Region-based Convolutional Neural Network (Mask R-CNN) and Faster R-CNN models. The simulation outcome demonstrated the enhanced performance of the PFPN-ADT technique over the other methods.
This paper proposes a vision-based pedestrian detection method for crowded situations based on a single camera. The main idea behind our work is to fuse multiple cues so that the major challenges in crowd detection, such as occlusion and complex backgrounds, can be overcome. Based on the assumption that human heads are visible, the circle Hough transform (CHT) is applied to detect all circular regions, each of which is considered a head candidate of a pedestrian. After that, the false candidates resulting from complex backgrounds are first removed using a template matching algorithm. Two proposed cues, called head foreground contrast (HFC) and block color relation (BCR), are incorporated for further verification. The rectangular region of every detected human is determined by the geometric relationships as well as the foreground mask extracted through background subtraction. Three videos are used to validate the proposed approach, and the experimental results show that the proposed method effectively lowers the false positives at the cost of only a small drop in detection rate.
This study proposes a motion cue based pedestrian detection method with two-frame-filtering (Tff) for video surveillance. The novel motion cue is exploited by the gray value variation between two frames. Then Tff processing filters the gradient magnitude image by the variation map. Summations of the Tff gradient magnitudes in cells are applied to train a pre-detector to exclude most of the background regions. The histogram of Tff oriented gradient (HTffOG) feature is proposed for pedestrian detection. Experimental results show that this method is effective and suitable for real-time surveillance applications.
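The two-frame-filtering idea above can be sketched on small grayscale images. This is a minimal sketch under assumptions not stated in the abstract: gradients are taken with central differences, the variation map is a binary mask of pixels whose gray value changed by more than a threshold, and the threshold and test frames are invented for illustration.

```python
def gradient_magnitude(img):
    """Central-difference gradient magnitude; border pixels are left at zero."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def two_frame_filter(prev, curr, thresh=10):
    """Keep gradient magnitudes only where the gray value changed between the frames."""
    mag = gradient_magnitude(curr)
    return [
        [mag[y][x] if abs(curr[y][x] - prev[y][x]) > thresh else 0.0
         for x in range(len(curr[0]))]
        for y in range(len(curr))
    ]


# Hypothetical frames: a static vertical edge plus one pixel that changes between frames.
prev = [[0, 0, 50, 50, 50] for _ in range(5)]
curr = [row[:] for row in prev]
curr[2][1] = 200  # the only moving pixel

filtered = two_frame_filter(prev, curr)
for row in filtered:
    print(row)
```

The static edge produces strong gradients in the current frame, but they are zeroed because the gray values there did not change, which is how such a filter suppresses background structure.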
YOLOv3 aims to improve on the detection speed and accuracy of current detection models. It predicts the center coordinates (x, y) of the bounding box together with its length and width through multiple layers of a VGG Convolutional Neural Network (VGG-CNN), and it uses the lightweight Darknet framework to process images at a faster speed. More specifically, our model removes part of YOLOv3's complex and computationally intensive procedures and improves its algorithms to maintain the efficiency and accuracy of object detection. With this method, it performs better on mass object detection tasks, with fewer detection errors.
Pedestrian detection is a critical challenge in the field of general object detection, whose performance has advanced with the development of deep learning. However, considerable improvement is still required for pedestrian detection, considering the differences in pedestrians' clothing, actions, and postures. In driver assistance systems, it is necessary to further improve intelligent pedestrian detection. We present a method based on the combination of SSD and GAN to improve the performance of pedestrian detection. First, we assess the impact of different SSD-based pedestrian detection methods and optimize the detection for pedestrian characteristics. Second, we propose a novel data synthesis network architecture, PS-GAN, to generate diverse pedestrian data for verifying the effect of massive training data on the SSD detector. Experimental results show that the proposed methods can improve the performance of pedestrian detection to some extent. Finally, we use the pedestrian detector to simulate a specific application in motor vehicle assisted driving, which makes the detector focus on specific pedestrians according to the velocity of the vehicle. The results establish the validity of the approach.
Pedestrian detection has a wide range of applications in daily life, and many fields require pedestrian detection with high precision and speed, which remains an urgent problem. Traditional pedestrian detection methods improve detection performance by improving the classification algorithm and extracting more effective features. In this paper, a pedestrian detection method is proposed based on the single shot multibox detector (SSD) model, which replaces the base network of the SSD model with an Inception network structure that has fewer parameters, faster running speed and stronger nonlinear expression ability. A high-performance network model for pedestrian detection is built on this improved SSD. The experimental results show that the proposed method is faster than the original model, and the average precision of pedestrian recognition and localization is 89.6%, which is 2.6% higher than the original model.
Vision-based player recognition is critical in sports applications. Accuracy, efficiency, and low memory usage are desirable for real-time tasks such as intelligent broadcasting and event classification. We developed an algorithm that tracks the movements of different players in a video of a basketball game. With their positions tracked, we then map the players' positions onto an image of a basketball court. The purpose of tracking players is to provide as much information as possible to basketball coaches and organizations so that they can better design defensive and attacking strategies. Overall, our model identifies and tracks the players on the court with a high degree of accuracy. We conducted experiments on soccer, basketball, ice hockey and pedestrian datasets. The experimental results show that our method can accurately detect players under challenging conditions. Compared with CNNs adapted from general object detection frameworks such as Faster-RCNN, our approach achieves state-of-the-art accuracy on three types of games (basketball, soccer and ice hockey) with 1000× fewer parameters. The generality of our method is further demonstrated on a standard pedestrian detection dataset, on which it achieves competitive performance compared with state-of-the-art methods.
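In real-time pedestrian detection scenes, the YOLOv5 model clearly misses pedestrians that are occluded or that vary in shape and pose. To address this, a pixel difference attention (PDA) mechanism is proposed. Traditional channel attention mechanisms use global average pooling (GAP) and global max pooling (GMP) to summarize an entire feature map; global pooling compresses the spatial dimensions into a single value that represents the whole channel, which loses spatial information. In contrast, PDA compresses the spatial information separately along the height and the width and links each to the channel information for attention weighting. A new channel descriptor is also proposed to represent channel information and strengthen the interaction between spatial and channel information, so that the model more readily attends to important feature-map information that integrates the spatial and channel dimensions. Inserting PDA at the end of the backbone network improves mean average precision (mAP) at 0.5 by 2.4 percentage points and mAP0.5:0.95 by 4.4 percentage points. Because deployment and detection speed in real-time scenarios require a model with few parameters and little computation, a new lightweight feature extraction module, AC3, is proposed to replace the C3 module in the original YOLOv5 model. With PDA inserted, this module reduces the parameter count (Param.) by about 20% and the floating-point operations (GFLOPs) by about 30% while losing only 0.2 percentage points of accuracy. Experimental results show that, compared with the original YOLOv5s model on the VOC pedestrian dataset, the final improved model raises mAP0.5 by 2.2 percentage points and mAP0.5:0.95 by 3.1 percentage points, reduces the parameter count by about 20% and the floating-point operations by about 30%, and increases the detection speed on a GTX1050 by 4 frames per second (FPS).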
Funding (multispectral pedestrian detection study): supported by the Henan Provincial Science and Technology Research Project under Grants 232102211006, 232102210044, 232102211017, 232102210055 and 222102210214; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; the Undergraduate Universities Smart Teaching Special Research Project of Henan Province under Grant Jiao Gao [2021] No. 489-29; and the Doctor Natural Science Foundation of Zhengzhou University of Light Industry under Grants 2021BSJJ025 and 2022BSJJZK13.
Funding (YOLOv2PD study): The authors are grateful to the Deanship of Scientific Research, King Saud University, Riyadh, Saudi Arabia, for funding this work through the Vice Deanship of Scientific Research Chairs: Research Chair of Pervasive and Mobile Computing.
Funding (cascade of peer AdaBoost classifiers study): Project (2018AAA0102102) supported by the National Science and Technology Major Project, China; Project (2017WK2074) supported by the Planned Science and Technology Project of Hunan Province, China; Project (B18059) supported by the National 111 Project, China; Project (61702559) supported by the National Natural Science Foundation of China.
Funding (cloud-edge collaboration study): supported by the Key-Area Research and Development Program of Guangdong Province (2021B0101420002); the Major Key Project of PCL (PCL2021A09); the National Natural Science Foundation of China (62072187); the Guangdong Major Project of Basic and Applied Basic Research (2019B030302002); the Guangdong Marine Economic Development Special Fund Project (GDNRC[2022]17); and Guangzhou Development Zone Science and Technology (2021GH10, 2020GH10).
Abstract: The rapid development of edge computing has driven an increasing number of deep learning applications, such as pedestrian and vehicle detection, to be deployed at the edge of the network to provide efficient intelligent services to mobile users. However, as accuracy requirements continue to increase, the components of deep learning models for pedestrian and vehicle detection, such as YOLOv4, become more sophisticated, and the computing resources required for model training increase dramatically. This makes it difficult to deploy such models effectively on resource-constrained edge devices while preserving high accuracy. To address this challenge, a cloud-edge collaboration-based pedestrian and vehicle detection framework is proposed in this paper. It trains models sufficiently by using the abundant computing resources in the cloud and then deploys the well-trained models on edge devices, thus reducing the computing resources required for model training on edge devices. Furthermore, to reduce the size of the model deployed on edge devices, an automatic pruning method that combines the convolution layer and the BN layer is proposed to compress the pedestrian and vehicle detection model. Experimental results show that the proposed framework is able to deploy the pruned model on a real edge device, the Jetson TX2, with 6.72 times higher FPS. Meanwhile, the channel pruning reduces the volume and the number of parameters of the model to 96.77%, and the amount of computation is reduced to 81.37%.
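The abstract above does not say which criterion its pruning method uses to pick channels. As a minimal sketch, assuming the common approach of ranking each channel by the magnitude of its BN scale factor (an assumption, not the paper's stated method), the selection step could look like this. The names `select_channels` and `prune_conv_bn` are hypothetical.

```python
def select_channels(bn_gammas, prune_ratio):
    """Return sorted indices of the channels to keep.

    Assumption: a channel's importance is |gamma| of its BN layer, so the
    channels with the smallest |gamma| are pruned first.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_prune = int(len(bn_gammas) * prune_ratio)
    # Channel indices ordered from least to most important.
    order = sorted(range(len(bn_gammas)), key=lambda i: abs(bn_gammas[i]))
    return sorted(order[n_prune:])


def prune_conv_bn(conv_filters, bn_gammas, bn_betas, prune_ratio):
    """Drop the same output channels from a conv layer and its BN layer.

    conv_filters holds one filter per output channel, aligned with the
    BN parameters by index.
    """
    keep = select_channels(bn_gammas, prune_ratio)
    return (
        [conv_filters[i] for i in keep],
        [bn_gammas[i] for i in keep],
        [bn_betas[i] for i in keep],
        keep,
    )


if __name__ == "__main__":
    filters = ["f0", "f1", "f2", "f3"]  # placeholders for weight tensors
    gammas = [0.9, 0.01, -0.5, 0.02]
    betas = [0.0, 0.1, 0.2, 0.3]
    print(prune_conv_bn(filters, gammas, betas, prune_ratio=0.5))
```

In a real network the input channels of the following convolution would also have to be sliced with the same `keep` indices, which this sketch leaves out.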
Funding: Project (50778015) supported by the National Natural Science Foundation of China; Project (2012CB725403) supported by the Major State Basic Research Development Program of China.
Abstract: A real-time pedestrian detection and tracking system using a single video camera was developed to monitor pedestrians. This system contained six modules: video flow capture, pre-processing, movement detection, shadow removal, tracking, and object classification. The Gaussian mixture model was utilized to extract the moving object from an image sequence segmented by the mean-shift technique in the pre-processing module. Shadow removal was used to alleviate the negative impact of shadows on the detected objects. A model-free method was adopted to identify pedestrians. The maximum and minimum integration methods were developed to integrate multiple cues into the mean-shift algorithm and the initial tracking iteration with the competent integrated probability distribution map for object tracking. A simple but effective algorithm was proposed to handle full occlusion cases. The system was tested using real traffic videos from different sites. The results of the test confirm that the system is reliable and has an overall accuracy of over 85%.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61702347, 61972267, 61772225) and the Natural Science Foundation of Hebei Province (Grant Nos. F2017210161, F2018210148).
Abstract: In recent years, pedestrian detection has been a hot research topic in computer vision and artificial intelligence, and it is widely used in security and pedestrian analysis. However, because traditional pedestrian detection technology requires a large amount of calculation, the speed of many pedestrian recognition systems is very limited. In some restricted areas, such as hazardous construction areas, real-time detection of pedestrians and cross-border behaviors is required. To detect more conveniently and efficiently whether there are pedestrians and cross-border behavior in a restricted area, this paper proposes a pedestrian cross-border detection method based on HOG (Histogram of Oriented Gradient) and SVM (Support Vector Machine). This method extracts the moving target through GMM (Gaussian Mixture Model) background modeling and then extracts the characteristics of the moving target through gradient HOG. Finally, it uses SVM training to distinguish pedestrians from non-pedestrians, completes the detection of pedestrians, and labels the targets. The test results show that extracting HOG features only from the candidate area greatly reduces the amount of calculation and the feature extraction time and eliminates background interference, thereby improving detection efficiency, so the method can be applied where real-time performance is required.
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01426) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation). This work was also partially supported by the Taif University Researchers Supporting Project Number (TURSP-2020/115), Taif University, Taif, Saudi Arabia, and by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R239), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Pedestrian detection and tracking are vital elements of today's surveillance systems, which make daily life safe for humans. Thus, human detection and visualization have become essential inventions in the field of computer vision. However, developing a surveillance system with multiple object recognition and tracking, especially in low light and at night, is still challenging. Therefore, we propose a novel system based on machine learning and image processing to provide an efficient surveillance system for pedestrian detection and tracking at night. In particular, we propose a system that tackles a two-fold problem by detecting multiple pedestrians in infrared (IR) images using machine learning and tracking them using particle filters. Moreover, a random forest classifier is adopted for image segmentation to identify pedestrians in an image. The detection result is then passed to a particle filter to solve pedestrian tracking. In extensive experiments, our system shows 93% segmentation accuracy using a random forest algorithm, with high accuracy for the background and roof classes. Moreover, the system achieved a detection accuracy of 90% using multiple template matching techniques and 81% accuracy for pedestrian tracking. Furthermore, our system can identify that the detected object is a human. Hence, our system provided the best results compared to the state-of-the-art systems, which proves the effectiveness of the techniques used for image segmentation, classification, and tracking. The presented method is applicable to human detection and tracking, crowd analysis, and pedestrian monitoring in IR video surveillance.
Abstract: In object detection, detecting an object with 100 pixels is substantially different from detecting an object with 10 pixels. Many object detection algorithms, such as the DPM detector, assume that the pedestrian scale is fixed during detection. However, detectors often perform differently at different scales. If a detector performs pedestrian detection at different scales, the accuracy of pedestrian detection could be improved. A multi-resolution DPM pedestrian detection algorithm is proposed in this paper. During model training, a resolution factor is added to the set of hidden variables of a latent SVM model. Then, during detection, a standard DPM model is used for high-resolution objects and a rigid template is adopted for low-resolution objects. In our experiments, we find that for low-resolution objects the detection accuracy of a standard DPM model is lower than that of a rigid template. On Caltech, the omission ratio of the multi-resolution DPM detector is 52% at 1 false positive per image (1 FPPI), whereas the omission ratio of a standard DPM detector rises to 59% (1 FPPI). On the large-scale sample set of Caltech, the omission ratios of the multi-resolution and the standard DPM detectors are 18% (1 FPPI) and 26% (1 FPPI), respectively.
Abstract: To address poor illumination and inaccurate positioning, this paper proposes a pedestrian detection algorithm suitable for low-light environments. The algorithm first applies the multi-scale Retinex image enhancement algorithm to sample pre-processing for deep learning to improve the image resolution. The paper then uses the faster regional convolutional neural network to train the pedestrian detection model, extract pedestrian characteristics, and obtain bounding boxes through classification and position regression. Finally, the Soft-NMS algorithm is introduced into the detection process to eliminate redundant bounding boxes and obtain the best pedestrian detection position. The experimental results show that the proposed detection algorithm achieves an average accuracy of 89.74% on the low-light dataset and that the pedestrian detection effect is more significant.
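The Soft-NMS step mentioned in the abstract above can be sketched in plain Python. This is a minimal illustration of the Gaussian variant of Soft-NMS, not the paper's implementation. The `sigma` and `score_threshold` defaults and the `(x1, y1, x2, y2)` box format are assumptions made for the example.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS.

    Instead of deleting boxes that overlap the current best box, their
    scores are decayed by exp(-IoU^2 / sigma). Boxes whose score falls
    below score_threshold are dropped. Returns (box, score) pairs in the
    order they were selected.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        # Decay the scores of the rest according to their overlap with it.
        rescored = []
        for box, score in candidates:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:
                rescored.append((box, score))
        candidates = rescored
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for box, score in soft_nms(boxes, scores):
        print(box, round(score, 4))
```

Unlike hard NMS, the heavily overlapping second box is kept with a reduced score, which is what helps when two pedestrians stand close together.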
Abstract: Presently, video surveillance is commonly employed to ensure security in public places such as traffic signals, malls, and railway stations. A major challenge in video surveillance is the identification of anomalies such as crimes and thefts. Besides, anomaly detection in pedestrian walkways has gained significant attention in the computer vision community as a way to enhance pedestrian safety. Recent Deep Learning (DL) models have received considerable attention in different processes such as object detection and image classification. In this aspect, this article designs a new Panoptic Feature Pyramid Network based Anomaly Detection and Tracking (PFPN-ADT) model for pedestrian walkways. The proposed model mainly aims at the recognition and classification of different anomalies present in the pedestrian walkway, such as vehicles and skaters. The proposed model employs a panoptic segmentation model, called Panoptic Feature Pyramid Network (PFPN), for the object recognition process. For object classification, the Compact Bat Algorithm (CBA) with a Stacked Auto Encoder (SAE) is applied to the recognized objects. To verify the enhanced anomaly detection performance of the PFPN-ADT technique, a comparison study is made using the University of California San Diego (UCSD) Anomaly data and other benchmark datasets (such as Cityscapes, ADE20K, and COCO), and the outcomes are compared with the Mask Recurrent Convolutional Neural Network (RCNN) and Faster Convolutional Neural Network (CNN) models. The simulation outcome demonstrated the enhanced performance of the PFPN-ADT technique over the other methods.
Abstract: This paper proposes a vision-based method for pedestrian detection in crowded situations using a single camera. The main idea behind our work is to fuse multiple cues so that the major challenges in crowd detection, such as occlusion and complex backgrounds, can be overcome. Based on the assumption that human heads are visible, the circle Hough transform (CHT) is applied to detect all circular regions, each of which is considered a head candidate of a pedestrian. After that, the false candidates resulting from complex backgrounds are first removed by a template matching algorithm. Two proposed cues, called head foreground contrast (HFC) and block color relation (BCR), are incorporated for further verification. The rectangular region of every detected human is determined by geometric relationships as well as the foreground mask extracted through background subtraction. Three videos are used to validate the proposed approach, and the experimental results show that the proposed method effectively lowers the false positives at the expense of a small loss in detection rate.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2007AA01Z164) and the National Natural Science Foundation of China (No. 61273258).
Abstract: This study proposes a motion cue based pedestrian detection method with two-frame-filtering (Tff) for video surveillance. The novel motion cue is derived from the gray value variation between two frames. Tff processing then filters the gradient magnitude image by the variation map. Summations of the Tff gradient magnitudes in cells are used to train a pre-detector that excludes most of the background regions. A histogram of Tff oriented gradient (HTffOG) feature is proposed for pedestrian detection. Experimental results show that this method is effective and suitable for real-time surveillance applications.
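The two-frame-filtering idea in the abstract above can be illustrated with a small sketch. The abstract does not give the exact filtering rule, so this example assumes a binary mask: the gradient magnitude is kept only where the gray value changed by more than a threshold between the two frames. The function names, the central-difference gradient, and the `threshold` default are all assumptions.

```python
def gradient_magnitude(img):
    """Gradient magnitude of a grayscale image (list of rows).

    Uses central differences, with indices clamped at the image border.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def two_frame_filter(prev_frame, curr_frame, threshold=10):
    """Keep gradient magnitudes only at pixels that changed between frames.

    Assumption: the variation map is |curr - prev| and it is applied as a
    hard mask, so static background gradients are set to zero.
    """
    grad = gradient_magnitude(curr_frame)
    return [
        [g if abs(c - p) > threshold else 0.0 for g, c, p in zip(grow, crow, prow)]
        for grow, crow, prow in zip(grad, curr_frame, prev_frame)
    ]


if __name__ == "__main__":
    prev_frame = [[0, 0, 0, 0] for _ in range(3)]
    curr_frame = [[0, 0, 100, 100] for _ in range(3)]
    for row in two_frame_filter(prev_frame, curr_frame):
        print(row)
```

Summing the filtered magnitudes over cells, as the abstract describes for the pre-detector, would then be a simple block sum over this output.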
Abstract: The main purpose of YOLOv3, which aims to improve on the detection speed and accuracy of current detection models, is to predict the center coordinates (x, y) of the bounding box and its length and width through multiple layers of a VGG Convolutional Neural Network (VGG-CNN), using the lightweight Darknet framework to process images at a faster speed. More specifically, our model removes part of YOLOv3's complex and computationally intensive procedures and improves its algorithms to maintain the efficiency and accuracy of object detection. With this method, it achieves higher quality on mass object detection tasks with fewer detection errors.
Abstract: Pedestrian detection is a critical challenge in the field of general object detection, whose performance has advanced with the development of deep learning. However, considerable improvement is still required for pedestrian detection, considering the differences in pedestrians' clothing, actions, and postures. In driver assistance systems, intelligent pedestrian detection needs to be improved further. We present a method based on the combination of SSD and GAN to improve the performance of pedestrian detection. First, we assess the impact of different SSD-based pedestrian detection methods and optimize the detection for pedestrian characteristics. Second, we propose a novel data synthesis network architecture, PS-GAN, to generate diverse pedestrian data and verify the effect of massive training data on the SSD detector. Experimental results show that the proposed methods can improve the performance of pedestrian detection to some extent. Finally, we use the pedestrian detector to simulate a specific application in motor vehicle assisted driving, in which the detector focuses on specific pedestrians according to the velocity of the vehicle. The results establish the validity of the approach.
Abstract: Pedestrian detection has a wide range of applications in daily life, and many fields require pedestrians to be detected with high precision and speed, which is an urgent problem to be solved. Traditional pedestrian detection methods improve detection performance by improving the classification algorithm and extracting more effective features. In this paper, a pedestrian detection method is proposed based on the single shot multibox detector (SSD) model. It replaces the base network of the SSD model with an Inception network structure, which has fewer parameters, runs faster, and has stronger nonlinear expression ability. A high-performance network model for pedestrian detection is built on this improved SSD. The experimental results show that the proposed method is faster than the original model and that the average precision of pedestrian recognition and location is 89.6%, which is 2.6% higher than that of the original model.
Abstract: Vision-based player recognition is critical in sports applications. Accuracy, efficiency, and low memory utilization are desirable for real-time tasks such as intelligent broadcasting and event classification. We developed an algorithm that tracks the movements of different players from a video of a basketball game. With their positions tracked, we then map the positions of these players onto an image of a basketball court. The purpose of tracking players is to provide the maximum amount of information to basketball coaches and organizations so that they can better design mechanisms of defence and attack. Overall, our model identifies and tracks the players on the court with a high degree of accuracy. We conducted experiments on soccer, basketball, ice hockey, and pedestrian datasets. The experimental results show that our technique can accurately detect players under challenging conditions. Compared with CNNs adapted from general object detection systems, such as Faster-RCNN, our approach achieves state-of-the-art accuracy on three types of games (basketball, soccer, and ice hockey) with 1000× fewer parameters. The generality of our technique is also shown on a standard pedestrian detection dataset, on which our method achieves competitive performance compared with state-of-the-art methods.
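Abstract: Real-time pedestrian detection scenes contain occluded pedestrians and pedestrians with varying shapes and poses, and the YOLOv5 model misses many of these targets. To address this, a pixel difference attention (PDA) mechanism is proposed. Traditional channel attention mechanisms use global average pooling (GAP) and global max pooling (GMP) to summarize the information of an entire feature map. Global pooling compresses the spatial dimensions into a single value that represents the whole channel, which loses spatial information. In contrast, PDA compresses spatial information along the height and the width separately and associates each with channel information for attention weighting. A new channel descriptor is also proposed to represent channel information, strengthening the interaction between spatial and channel information so that the model attends more readily to important feature-map information that combines the spatial and channel dimensions. Inserting PDA at the end of the backbone network improves mean average precision (mAP) at 0.5 (mAP0.5) by 2.4 percentage points and mAP0.5:0.95 by 4.4 percentage points. Because deployment and detection speed in real-time scenarios require a model with few parameters and low computational cost, a new lightweight feature extraction module, AC3, is proposed to replace the C3 module in the original YOLOv5 model. With this module, the improved model with PDA inserted loses only 0.2 percentage points of accuracy, while the parameter count (Param.) falls by about 20% and the floating-point operations (GFLOPs) fall by about 30%. Experimental results show that, compared with the original YOLOv5s model on the VOC pedestrian dataset, the final improved model raises mAP0.5 by 2.2 percentage points and mAP0.5:0.95 by 3.1 percentage points, reduces the parameter count by about 20% and the floating-point operations by about 30%, and increases the detection speed on a GTX1050 by 4 frames per second (FPS).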
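The contrast this abstract draws between global pooling and pooling along the height and width separately can be shown on a single-channel feature map. This sketch covers only the pooling step, using plain averaging as an assumed stand-in. The paper's new channel descriptor and its attention weighting are not specified in the abstract and are not reproduced here. All function names are hypothetical.

```python
def global_average_pool(fmap):
    """Collapse an H x W feature map to one value (position is lost)."""
    total = sum(sum(row) for row in fmap)
    return total / (len(fmap) * len(fmap[0]))


def pool_along_width(fmap):
    """Average each row, giving H values that keep vertical position."""
    return [sum(row) / len(row) for row in fmap]


def pool_along_height(fmap):
    """Average each column, giving W values that keep horizontal position."""
    h = len(fmap)
    return [sum(col) / h for col in zip(*fmap)]


if __name__ == "__main__":
    fmap = [
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
    ]
    print(global_average_pool(fmap))   # one value for the whole channel
    print(pool_along_width(fmap))      # one value per row
    print(pool_along_height(fmap))     # one value per column
```

The two directional vectors together retain where along each axis the activation is strong, which a single global value cannot express.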