Funding: Supported by the National Natural Science Foundation of China (60835004, 60775047, 60872130) and the National High Technology Research and Development Program of China (863 Program) (2007AA04Z244, 2008AA04Z214).
Abstract: An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling the objects reliably and tracking them via their models in successive frames. An effective tracking method using object models is proposed to track multiple objects in a real-time visual surveillance system. First, objects are detected with an adaptive kernel density estimation method that uses an adaptive bandwidth and features combining colour and gradient. Second, models of the objects are built to describe their motion, shape and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusion, the optimal "visual" object is found to represent the occluded object, and the posterior probability of each pixel is used to determine which pixels are used to update the object models. Extensive experiments show that this method improves the accuracy and validity of object tracking even under occlusion and can be used in real-time visual surveillance systems.
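The matching-matrix step can be illustrated with a minimal sketch. The abstract does not say how a match is decided, so bounding-box overlap is used here purely as an assumption, and the names `overlap`, `matching_matrix` and `tracking_situations` are hypothetical rather than taken from the paper.

```python
import numpy as np

def overlap(a, b):
    """Intersection area of two boxes given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def matching_matrix(tracks, detections):
    """Binary matrix: entry (i, j) is 1 if track i overlaps detection j.

    Assumption: overlap is the match test; the paper's own test may use
    motion, shape and colour models instead.
    """
    m = np.zeros((len(tracks), len(detections)), dtype=int)
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            m[i, j] = int(overlap(t, d) > 0)
    return m

def tracking_situations(m):
    """Read tracking situations off the row and column sums of the matrix."""
    rows, cols = m.sum(axis=1), m.sum(axis=0)
    return {
        "lost": [i for i, s in enumerate(rows) if s == 0],    # track with no detection
        "split": [i for i, s in enumerate(rows) if s > 1],    # track matching several detections
        "new": [j for j, s in enumerate(cols) if s == 0],     # detection with no track
        "merged": [j for j, s in enumerate(cols) if s > 1],   # several tracks in one detection (possible occlusion)
    }
```

Reading situations from row and column sums keeps the occlusion test separate from whatever similarity measure fills the matrix, so the overlap test can be swapped for a model-based score without changing the rest.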
Funding: Supported by the National Natural Science Foundation of China (No. 60472036), the Beijing Natural Science Foundation (No. 4052007), and the Beijing Novel Program (No. 2005B08).
Abstract: Construction of high resolution images from low resolution sequences is often important in surveillance applications. In this letter, an affine-based multi-scale block-matching image registration algorithm is first proposed. The images to be registered are divided into overlapping blocks of different sizes according to their motion. The Least Square (LS) image registration algorithm is extended to match the blocks. Then an object-based Super Resolution (SR) scheme is designed, in which the Maximum A Posteriori (MAP) super resolution algorithm is extended to enhance the resolution of the objects of interest. Experimental results show that the proposed multi-scale registration method provides more accurate registration between frames. Furthermore, the object-based super resolution scheme shows enhanced performance compared with the traditional MAP method.
Funding: National High-Tech Research and Development Plan of China (No. 2003AA1Z2130) and Science and Technology Project of Zhejiang Province of China (No. 2005C1100102).
Abstract: A frequent trajectory pattern mining algorithm is proposed to learn object activities and classify trajectories in an intelligent visual surveillance system. The distribution patterns of the trajectories were generated by an Apriori-based frequent pattern mining algorithm, and the trajectories were classified by the frequent trajectory patterns generated. In addition, a fuzzy c-means (FCM) based learning algorithm and a mean shift based clustering procedure were used to construct the representation of trajectories. The algorithm can be further used to describe activities and identify anomalies. Experiments on two real scenes show that the algorithm is effective.
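As a rough illustration of the Apriori step, the sketch below mines frequent itemsets from trajectories that have already been reduced to sets of region labels. That reduction, the labels themselves, and the function name `apriori` are assumptions for this example; it also ignores the order in which regions are visited, which the paper's trajectory representation may well preserve.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(items): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Count single items and keep those meeting the support threshold.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count support of the surviving candidates.
        current = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_support:
                current[c] = n
        frequent.update(current)
        k += 1
    return frequent
```

A trajectory could then be assigned to the frequent pattern it contains with the highest support, although the abstract does not state which classification rule the authors use.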
Funding: Supported by grants PID2022-142946NA-I00 and PID2022-141159OB-I00, funded by MICIU/AEI/10.13039/501100011033 and ERDF/EU.
Abstract: In the context of multiple-target tracking and surveillance applications, this paper investigates the challenge of determining the optimal positioning of a single autonomous aerial vehicle or agent equipped with multiple independently steerable zooming cameras to effectively monitor a set of targets of interest. Each camera is dedicated to tracking a specific target or cluster of targets. The key innovation of this study, in comparison to existing approaches, lies in incorporating the zooming factor of the onboard cameras into the optimization problem. This enhancement offers greater flexibility during mission execution by allowing the autonomous agent to adjust the focal lengths of the onboard cameras in exchange for varying real-world distances to the corresponding targets, thereby providing additional degrees of freedom to the optimization problem. The proposed optimization framework aims to strike a balance among various factors, including distance to the targets, verticality of viewpoints, and the required focal length for each camera. The primary focus of this paper is to establish the theoretical groundwork for addressing the non-convex nature of the optimization problem arising from these considerations. To this end, we develop an original convex approximation strategy. The paper also includes simulations of diverse scenarios, featuring varying numbers of onboard tracking cameras and target motion profiles, to validate the effectiveness of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61175007, the National Key Technologies R&D Program under Grant No. 2012BAH07B01, and the National Key Basic Research Program of China (973 Program) under Grant No. 2012CB316302.
Abstract: Crowd density estimation in wide areas is a challenging problem for visual surveillance. Because of the high risk of degeneration, the safety of public events involving large crowds has always been a major concern. In this paper, we propose a video-based crowd density analysis and prediction system for wide-area surveillance applications. In monocular image sequences, the Accumulated Mosaic Image Difference (AMID) method is applied to extract crowd areas having irregular motion. The number of persons in a crowd and its velocity can be estimated by our system from the density of the crowded areas. Using a multi-camera network, we can obtain predictions of a crowd's density several minutes in advance. The system has been used in real applications, and numerous experiments conducted in real scenes (station, park, plaza) demonstrate the effectiveness and robustness of the proposed method.
Funding: Supported by the National Natural Science Foundation of China (No. 90304001, 60472036), the Beijing Natural Science Foundation (4052007), the National Key Lab of Communication Foundation, UEST, China (51434050105QT0101), and the PolyU/UGC grants (B-Q698).
Abstract: Construction of high resolution images from low resolution sequences having rigid or semi-rigid objects with unified motions is often important in surveillance and other applications. In this paper a novel object-based super resolution reconstruction scheme was proposed, in which a six-parameter affine model-based object tracking and registration method was first used to segment and match objects among a sequence of low resolution frames. The motion model was then further extended to the traditional maximum a posteriori (MAP) super resolution algorithm. The proposed object tracking and registration method was evaluated on both simulated and real acquired sequences. The results have demonstrated the high accuracy of the proposed object-based method and the enhanced reconstruction performance of the extended approach.
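For reference, a six-parameter affine motion model is conventionally written as below. The symbols $a_1,\dots,a_6$, $(x, y)$ and $(x', y')$ are notation chosen here for illustration; the abstract does not give the paper's own parameterisation.

```latex
\begin{equation}
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} a_5 \\ a_6 \end{pmatrix}
\end{equation}
```

Here $(x, y)$ is a pixel position in one frame and $(x', y')$ its position in another; the $2 \times 2$ matrix accounts for rotation, scaling and shear, and $(a_5, a_6)$ is the translation.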
Funding: Project supported by the National Natural Science Foundation of China (No. 60272031), the Technology Plan Program of Zhejiang Province (No. 2003C21010), and the Zhejiang Provincial Natural Science Foundation of China (No. M603202).
Abstract: Detecting objects of interest from a video sequence is a fundamental and critical task in automated visual surveillance. Most current approaches focus only on discriminating moving objects by background subtraction, even though the objects of interest may be either moving or stationary. In this paper, we propose layer segmentation to detect both moving and stationary target objects in surveillance video. We extend the Maximum Entropy (ME) statistical model to segment layers using features that are collected by constructing a codebook with a set of codewords for each pixel. We also indicate how the trained models are used to discriminate target objects in surveillance video. Our experimental results are presented in terms of the success rate and the segmentation precision.
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0008703, The Competency Development Program for Industry Specialist) and by the MSIT (Ministry of Science and ICT), Republic of Korea, under the ITRC (Information Technology Research Center) support program (IITP-2022-2018-0-01799) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
Abstract: Innovations in Internet of Everything (IoE) enabled systems are driving a change in the settings where we interact in smart units, recognized globally as smart city environments. However, intelligent video-surveillance systems are critical to increasing the security of these smart cities. More precisely, in today’s world of smart video surveillance, person re-identification (Re-ID) has gained increased consideration by researchers. Various researchers have designed deep learning-based algorithms for person Re-ID because such algorithms have achieved substantial breakthroughs in computer vision problems. In this line of research, we designed an adaptive feature refinement-based deep learning architecture to conduct person Re-ID. In the proposed architecture, the inter-channel and inter-spatial relationships of features between images of the same individual taken from different camera viewpoints are exploited to learn spatial and channel attention. In addition, a spatial pyramid pooling layer is inserted to extract multiscale, fixed-dimension feature vectors irrespective of the size of the feature maps. Furthermore, the model’s effectiveness is validated on the CUHK01 and CUHK02 datasets. When compared with existing approaches, the approach presented in this paper achieves encouraging Rank-1 and Rank-5 scores of 24.6% and 54.8%, respectively.
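The fixed-dimension property of spatial pyramid pooling can be shown with a small NumPy sketch. It assumes max pooling and pyramid levels of 1, 2 and 4 bins per side; neither choice is stated in the abstract, and `spatial_pyramid_pool` is a hypothetical name, not the authors' layer.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over n x n grids for each n in levels.

    The output length is C * sum(n * n for n in levels), whatever H and W are.
    """
    c, h, w = fmap.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin edges use floor for the start and ceil for the end,
            # so every bin is non-empty even when h or w is smaller than n.
            r0, r1 = (i * h) // n, -(-((i + 1) * h) // n)
            for j in range(n):
                c0, c1 = (j * w) // n, -(-((j + 1) * w) // n)
                pooled.append(fmap[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)
```

With the assumed levels, a feature map with 8 channels always yields a vector of length 8 × (1 + 4 + 16) = 168, which is what lets a fully connected layer follow feature maps of varying size.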
Abstract: Automatic object classification in traffic scene videos is an important issue for intelligent visual surveillance, with great potential for all kinds of security applications. However, this problem is very challenging for the following reasons. Firstly, regions of interest in videos are of low resolution and limited size due to the capacity of conventional surveillance cameras. Secondly, the intra-class variations are very large due to changes in view angle, lighting conditions, and environment. Thirdly, real-time performance of algorithms is always required for real applications. In this paper, we evaluate the performance of local feature descriptors for automatic object classification in traffic scenes. Image intensity or gradient information is used directly to construct effective feature vectors from regions of interest extracted via motion detection. This strategy has great advantages in efficiency compared to various complicated texture features. We not only analyze and evaluate the performance of different feature descriptors, but also fuse different scales and features to achieve better performance. Numerous experiments are conducted, and the experimental results demonstrate the efficiency and effectiveness of this strategy, with robustness to noise and to variations in view angle, lighting conditions, and environment.
Funding: International Cooperation and Exchange Program of Shaanxi Province, Grant/Award Number: 2022KW‐04; Natural Science Foundation of Shaanxi Province, Grant/Award Number: 2018JM6120; Xi'an Science and Technology Plan Project, Grant/Award Number: 21XJZZ0072; Major Science and Technology Projects of Xian Yang City, Grant/Award Number: 2017k01‐25‐12.
Abstract: This paper considers the problem of long-term target tracking in complex scenes where tracking failures are unavoidable due to illumination change, target deformation, scale change, motion blur, and other factors. More specifically, a target tracking algorithm called re-detection multi-feature fusion is proposed, based on the fusion of scale-adaptive kernel correlation filtering and re-detection. The algorithm trains three kernel correlation filters based on histogram of oriented gradients, colour name, and local binary pattern features. It then obtains the fusion weights of the response graphs corresponding to the different features based on the average peak correlation energy criterion and uses a weighted average to estimate the position of the tracked target. To deal with the target being occluded and disappearing during tracking, a random fern classifier is trained to perform re-detection when the target is occluded. After comparison on the OTB-50 target tracking dataset, the experimental results show that the proposed tracker can track the target well in the occlusion-attribute video sequences of the OTB-100 test dataset and improves tracking accuracy and success rate compared with the traditional correlation filter tracker.
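The fusion step can be sketched as follows. The average peak correlation energy (APCE) of a response map $F$ is commonly defined as $|F_{\max} - F_{\min}|^2 / \operatorname{mean}\big((F - F_{\min})^2\big)$; making each feature's weight proportional to its APCE is an assumption here, since the abstract does not give the exact weighting rule, and the function names are hypothetical.

```python
import numpy as np

def apce(response):
    """Average peak correlation energy of a 2-D response map.

    Undefined for a constant map (the denominator would be zero).
    """
    f_max, f_min = response.max(), response.min()
    return (f_max - f_min) ** 2 / np.mean((response - f_min) ** 2)

def fuse_responses(responses):
    """Weighted average of response maps, with weights proportional to APCE.

    Assumption: weights are the normalised APCE scores of each map.
    Returns the fused map, the weights, and the (row, col) of the fused peak.
    """
    scores = np.array([apce(r) for r in responses])
    weights = scores / scores.sum()
    fused = sum(w * r for w, r in zip(weights, responses))
    peak = np.unravel_index(np.argmax(fused), fused.shape)
    return fused, weights, tuple(int(p) for p in peak)
```

A sharp, single-peaked response has a high APCE and therefore dominates the fused map, while a flat or multi-peaked response, as often seen under occlusion, contributes little.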