Indoor multi-tracking is more challenging compared with outdoor tasks due to frequent occlusion, view-truncation, severe scale change and pose variation, which may bring considerable unreliability and ambiguity to tar...Indoor multi-tracking is more challenging compared with outdoor tasks due to frequent occlusion, view-truncation, severe scale change and pose variation, which may bring considerable unreliability and ambiguity to target representation and data association. So discriminative and reliable target representation is vital for accurate data association in multi-tracking. Pervious works always combine bunch of features to increase the discriminative power, but this is prone to error accumulation and unnecessary computational cost, which may increase ambiguity on the contrary. Moreover, reliability of a same feature in different scenes may vary a lot, especially for currently widespread network cameras, which are settled in various and complex indoor scenes, previous fixed feature selection schemes cannot meet general requirements. To properly handle these problems, first, we propose a scene-adaptive hierarchical data association scheme, which adaptively selects features with higher reliability on target representation in the applied scene, and gradually combines features to the minimum requirement of discriminating ambiguous targets; second, a novel depth-invariant part-based appearance model using RGB-D data is proposed which makes the appearance model robust to scale change, partial occlusion and view-truncation. The introduce of RGB-D data increases the diversity of features, which provides more types of features for feature selection in data association and enhances the final multi-tracking performance. We validate our method from several aspects including scene-adaptive feature selection scheme, hierarchical data association scheme and RGB-D based appearance modeling scheme in various indoor scenes, which demonstrates its effectiveness and efficiency on improving multi-tracking performances in various indoor scenes.展开更多
An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method...An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using the object models is proposed to track multiple objects in a real-time visual surveillance system. Firstly, for detecting objects, an adaptive kernel density estimation method is utilized, which uses an adaptive bandwidth and features combining colour and gradient. Secondly, some models of objects are built for describing motion, shape and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusions, the optimal "visual" object is found to represent the occluded object, and the posterior probability of pixel is used to determine which pixel is utilized for updating object models. Extensive experiments show that this method improves the accuracy and validity of tracking objects even under occlusions and is used in real-time visual surveillance systems.展开更多
MGAC (Motion Geometric Active Contours), a new variational framework of geometric active contours to track multiple nonrigid moving objects in the clutter background in image sequences is presented. This framework, in...MGAC (Motion Geometric Active Contours), a new variational framework of geometric active contours to track multiple nonrigid moving objects in the clutter background in image sequences is presented. This framework, incorporating with the motion edge information, consists of motion detection and tracking stages. At the motion detection stage, the motion edge map provides an approximate edge map of the moving objects. Then, a tracking stage, merely using the static edge information, is considered to improve the motion detection result. Force field regularization method is used to extend the capture range of the edge attraction force field in both stages. Experiments demonstrate that the proposed framework is valid for tracking multiple nonrigid objects in the clutter background.展开更多
Significant advancements have beenwitnessed in visual tracking applications leveragingViT in recent years,mainly due to the formidablemodeling capabilities of Vision Transformer(ViT).However,the strong performance of ...Significant advancements have beenwitnessed in visual tracking applications leveragingViT in recent years,mainly due to the formidablemodeling capabilities of Vision Transformer(ViT).However,the strong performance of such trackers heavily relies on ViT models pretrained for long periods,limitingmore flexible model designs for tracking tasks.To address this issue,we propose an efficient unsupervised ViT pretraining method for the tracking task based on masked autoencoders,called TrackMAE.During pretraining,we employ two shared-parameter ViTs,serving as the appearance encoder and motion encoder,respectively.The appearance encoder encodes randomly masked image data,while the motion encoder encodes randomly masked pairs of video frames.Subsequently,an appearance decoder and a motion decoder separately reconstruct the original image data and video frame data at the pixel level.In this way,ViT learns to understand both the appearance of images and the motion between video frames simultaneously.Experimental results demonstrate that ViT-Base and ViT-Large models,pretrained with TrackMAE and combined with a simple tracking head,achieve state-of-the-art(SOTA)performance without additional design.Moreover,compared to the currently popular MAE pretraining methods,TrackMAE consumes only 1/5 of the training time,which will facilitate the customization of diverse models for tracking.For instance,we additionally customize a lightweight ViT-XS,which achieves SOTA efficient tracking performance.展开更多
A literature analysis has shown that object search,recognition,and tracking systems are becoming increasingly popular.However,such systems do not achieve high practical results in analyzing small moving living objects...A literature analysis has shown that object search,recognition,and tracking systems are becoming increasingly popular.However,such systems do not achieve high practical results in analyzing small moving living objects ranging from 8 to 14 mm.This article examines methods and tools for recognizing and tracking the class of small moving objects,such as ants.To fulfill those aims,a customized You Only Look Once Ants Recognition(YOLO_AR)Convolutional Neural Network(CNN)has been trained to recognize Messor Structor ants in the laboratory using the LabelImg object marker tool.The proposed model is an extension of the You Only Look Once v4(Yolov4)512×512 model with an additional Self Regularized Non–Monotonic(Mish)activation function.Additionally,the scalable solution for continuous object recognizing and tracking was implemented.This solution is based on the OpenDatacam system,with extended Object Tracking modules that allow for tracking and counting objects that have crossed the custom boundary line.During the study,the methods of the alignment algorithm for finding the trajectory of moving objects were modified.I discovered that the Hungarian algorithm showed better results in tracking small objects than the K–D dimensional tree(k-d tree)matching algorithm used in OpenDataCam.Remarkably,such an algorithm showed better results with the implemented YOLO_AR model due to the lack of False Positives(FP).Therefore,I provided a new tracker module with a Hungarian matching algorithm verified on the Multiple Object Tracking(MOT)benchmark.Furthermore,additional customization parameters for object recognition and tracking results parsing and filtering were added,like boundary angle threshold(BAT)and past frames trajectory prediction(PFTP).Experimental tests confirmed the results of the study on a mobile device.During the experiment,parameters such as the quality of recognition and tracking of moving objects,the PFTP and BAT,and the configuration parameters of the neural network and boundary line model were analyzed.The results showed an increased tracking accuracy with the proposed methods by 50%.The study results confirmed the relevance of the topic and the effectiveness of the implemented methods and tools.展开更多
The amount of needed control messages in wireless sensor networks(WSN)is affected by the storage strategy of detected events.Because broadcasting superfluous control messages consumes excess energy,the network lifespa...The amount of needed control messages in wireless sensor networks(WSN)is affected by the storage strategy of detected events.Because broadcasting superfluous control messages consumes excess energy,the network lifespan can be extended if the quantity of control messages is decreased.In this study,an optimized storage technique having low control overhead for tracking the objects in WSN is introduced.The basic concept is to retain observed events in internal memory and preserve the relationship between sensed information and sensor nodes using a novel inexpensive data structure entitled Ordered Binary Linked List(OBLL).Whenever an object passes over the sensor area,the recognizing sensor can immediately produce an OBLL along the object’s route.To retrieve the entire information,the OBLL can be traversed with logarithmic complexity which is much less than the traversing complexity of existing linked list structures.Performance evaluation and simulations were carried out to ensure that the suggested technique minimizes the number of messages and thus saving energy and extending the network life.展开更多
Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have becom...Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information.However,current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information.In this paper,we introduce self-calibration multi-head self-attention Transformer(SMSTracker)as a solution to these challenges.It employs a hybrid tensor decomposition self-organizing multihead self-attention transformermechanism,which not only compresses and accelerates Transformer operations but also significantly reduces redundant data,thereby enhancing the accuracy and efficiency of tracking.Additionally,we introduce a self-calibration attention fusion block to resolve common issues of attention ambiguities and inconsistencies found in traditional trackingmethods,ensuring the stability and reliability of tracking performance across various scenarios.By integrating a hybrid tensor decomposition approach with a self-organizingmulti-head self-attentive transformer mechanism,SMSTracker enhances the efficiency and accuracy of the tracking process.Experimental results show that SMSTracker achieves competitive performance in visual object tracking,promising more robust and efficient tracking systems,demonstrating its potential to providemore robust and efficient tracking solutions in real-world applications.展开更多
Visual object-tracking is a fundamental task applied in many applications of computer vision. Particle filter is one of the techniques which has been widely used in object tracking. Due to the virtue of extendability ...Visual object-tracking is a fundamental task applied in many applications of computer vision. Particle filter is one of the techniques which has been widely used in object tracking. Due to the virtue of extendability and flexibility on both linear and non-linear environments, various particle filter-based trackers have been proposed in the literature. However, the conventional approach cannot handle very large videos efficiently in the current data intensive information age. In this work, a parallelized particle filter is provided in a distributed framework provided by the Hadoop/Map-Reduce infrastructure to tackle object-tracking tasks. The experiments indicate that the proposed algorithm has a better convergence and accuracy as compared to the traditional particle filter. The computational power and the scalability of the proposed particle filter in single object tracking have been enhanced as well.展开更多
An improved estimation of motion vectors of feature points is proposed for tracking moving objects of dynamic image sequence. Feature points are firstly extracted by the improved minimum intensity change (MIC) algor...An improved estimation of motion vectors of feature points is proposed for tracking moving objects of dynamic image sequence. Feature points are firstly extracted by the improved minimum intensity change (MIC) algorithm. The matching points of these feature points are then determined by adaptive rood pattern searching. Based on the random sample consensus (RANSAC) method, the background motion is finally compensated by the parameters of an affine transform of the background motion. With reasonable morphological filtering, the moving objects are completely extracted from the background, and then tracked accurately. Experimental results show that the improved method is successful on the motion background compensation and offers great promise in tracking moving objects of the dynamic image sequence.展开更多
In practical application,mean shift tracking algorithm is easy to generate tracking drift when the target and the background have similar color distribution.Based on the mean shift algorithm,a kind of background weake...In practical application,mean shift tracking algorithm is easy to generate tracking drift when the target and the background have similar color distribution.Based on the mean shift algorithm,a kind of background weaken weight is proposed in the paper firstly.Combining with the object center weight based on the kernel function,the problem of interference of the similar color background can be solved.And then,a model updating strategy is presented to improve the tracking robustness on the influence of occlusion,illumination,deformation and so on.With the test on the sequence of Tiger,the proposed approach provides better performance than the original mean shift tracking algorithm.展开更多
There are two main trends in the development of unmanned aerial vehicle(UAV)technologies:miniaturization and intellectualization,in which realizing object tracking capabilities for a nano-scale UAV is one of the most ...There are two main trends in the development of unmanned aerial vehicle(UAV)technologies:miniaturization and intellectualization,in which realizing object tracking capabilities for a nano-scale UAV is one of the most challenging problems.In this paper,we present a visual object tracking and servoing control system utilizing a tailor-made 38 g nano-scale quadrotor.A lightweight visual module is integrated to enable object tracking capabilities,and a micro positioning deck is mounted to provide accurate pose estimation.In order to be robust against object appearance variations,a novel object tracking algorithm,denoted by RMCTer,is proposed,which integrates a powerful short-term tracking module and an efficient long-term processing module.In particular,the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner.Furthermore,a positionbased visual servoing control method is proposed for the quadrotor,where an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques.Stable and accurate object tracking is achieved even under disturbances.Experimental results are presented to demonstrate the high accuracy and stability of the whole tracking system.展开更多
Collaborative Robotics is one of the high-interest research topics in the area of academia and industry.It has been progressively utilized in numerous applications,particularly in intelligent surveillance systems.It a...Collaborative Robotics is one of the high-interest research topics in the area of academia and industry.It has been progressively utilized in numerous applications,particularly in intelligent surveillance systems.It allows the deployment of smart cameras or optical sensors with computer vision techniques,which may serve in several object detection and tracking tasks.These tasks have been considered challenging and high-level perceptual problems,frequently dominated by relative information about the environment,where main concerns such as occlusion,illumination,background,object deformation,and object class variations are commonplace.In order to show the importance of top view surveillance,a collaborative robotics framework has been presented.It can assist in the detection and tracking of multiple objects in top view surveillance.The framework consists of a smart robotic camera embedded with the visual processing unit.The existing pre-trained deep learning models named SSD and YOLO has been adopted for object detection and localization.The detection models are further combined with different tracking algorithms,including GOTURN,MEDIANFLOW,TLD,KCF,MIL,and BOOSTING.These algorithms,along with detection models,help to track and predict the trajectories of detected objects.The pre-trained models are employed;therefore,the generalization performance is also investigated through testing the models on various sequences of top view data set.The detection models achieved maximum True Detection Rate 93%to 90%with a maximum 0.6%False Detection Rate.The tracking results of different algorithms are nearly identical,with tracking accuracy ranging from 90%to 94%.Furthermore,a discussion has been carried out on output results along with future guidelines.展开更多
This paper describes a new framework for object detection and tracking of AUV including underwater acoustic data interpolation, underwater acoustic images segmentation and underwater objects tracking. This framework i...This paper describes a new framework for object detection and tracking of AUV including underwater acoustic data interpolation, underwater acoustic images segmentation and underwater objects tracking. This framework is applied to the design of vision-based method for AUV based on the forward looking sonar sensor. First, the real-time data flow (underwater acoustic images) is pre-processed to form the whole underwater acoustic image, and the relevant position information of objects is extracted and determined. An improved method of double threshold segmentation is proposed to resolve the problem that the threshold cannot be adjusted adaptively in the traditional method. Second, a representation of region information is created in light of the Gaussian particle filter. The weighted integration strategy combining the area and invariant moment is proposed to perfect the weight of particles and to enhance the tracking robustness. Results obtained on the real acoustic vision platform of AUV during sea trials are displayed and discussed. They show that the proposed method can detect and track the moving objects underwater online, and it is effective and robust.展开更多
Inspired by human behaviors, a robot object tracking model is proposed on the basis of visual attention mechanism, which is fit for the theory of topological perception. The model integrates the image-driven, bottom-u...Inspired by human behaviors, a robot object tracking model is proposed on the basis of visual attention mechanism, which is fit for the theory of topological perception. The model integrates the image-driven, bottom-up attention and the object-driven, top-down attention, whereas the previous attention model has mostly focused on either the bottom-up or top-down attention. By the bottom-up component, the whole scene is segmented into the ground region and the salient regions. Guided by top-down strategy which is achieved by a topological graph, the object regions are separated from the salient regions. The salient regions except the object regions are the barrier regions. In order to estimate the model, a mobile robot platform is developed, on which some experiments are implemented. The experimental results indicate that processing an image with a resolution of 752 × 480 pixels takes less than 200 ms and the object regions are unabridged. The analysis obtained by comparing the proposed model with the existing model demonstrates that the proposed model has some advantages in robot object tracking in terms of speed and efficiency.展开更多
A method for moving object recognition and tracking in the intelligent traffic monitoring system is presented. For the shortcomings and deficiencies of the frame-subtraction method, a redundant discrete wavelet transf...A method for moving object recognition and tracking in the intelligent traffic monitoring system is presented. For the shortcomings and deficiencies of the frame-subtraction method, a redundant discrete wavelet transform (RDWT) based moving object recognition algorithm is put forward, which directly detects moving objects in the redundant discrete wavelet transform domain. An improved adaptive mean-shift algorithm is used to track the moving object in the follow up frames. Experimental results show that the algorithm can effectively extract the moving object, even though the object is similar to the background, and the results are better than the traditional frame-subtraction method. The object tracking is accurate without the impact of changes in the size of the object. Therefore the algorithm has a certain practical value and prospect.展开更多
In this paper,we provide a new approach for intelligent traffic transportation in the intelligent vehicular networks,which aims at collecting the vehicles’locations,trajectories and other key driving parameters for t...In this paper,we provide a new approach for intelligent traffic transportation in the intelligent vehicular networks,which aims at collecting the vehicles’locations,trajectories and other key driving parameters for the time-critical autonomous driving’s requirement.The key of our method is a multi-vehicle tracking framework in the traffic monitoring scenario..Our proposed framework is composed of three modules:multi-vehicle detection,multi-vehicle association and miss-detected vehicle tracking.For the first module,we integrate self-attention mechanism into detector of using key point estimation for better detection effect.For the second module,we apply the multi-dimensional information for robustness promotion,including vehicle re-identification(Re-ID)features,historical trajectory information,and spatial position information For the third module,we re-track the miss-detected vehicles with occlusions in the first detection module.Besides,we utilize the asymmetric convolution and depth-wise separable convolution to reduce the model’s parameters for speed-up.Extensive experimental results show the effectiveness of our proposed multi-vehicle tracking framework.展开更多
In dense pedestrian tracking,frequent object occlusions and close distances between objects cause difficulty when accurately estimating object trajectories.In this study,a conditional random field tracking model is es...In dense pedestrian tracking,frequent object occlusions and close distances between objects cause difficulty when accurately estimating object trajectories.In this study,a conditional random field tracking model is established by using a visual long short term memory network in the three-dimensional(3D)space and the motion estimations jointly performed on object trajectory segments.Object visual field information is added to the long short term memory network to improve the accuracy of the motion related object pair selection and motion estimation.To address the uncertainty of the length and interval of trajectory segments,a multimode long short term memory network is proposed for the object motion estimation.The tracking performance is evaluated using the PETS2009 dataset.The experimental results show that the proposed method achieves better performance than the tracking methods based on the independent motion estimation.展开更多
Unmanned aerial vehicles(UAVs)can be used to monitor traffic in a variety of settings,including security,traffic surveillance,and traffic control.Numerous academics have been drawn to this topic because of the challen...Unmanned aerial vehicles(UAVs)can be used to monitor traffic in a variety of settings,including security,traffic surveillance,and traffic control.Numerous academics have been drawn to this topic because of the challenges and the large variety of applications.This paper proposes a new and efficient vehicle detection and tracking system that is based on road extraction and identifying objects on it.It is inspired by existing detection systems that comprise stationary data collectors such as induction loops and stationary cameras that have a limited field of view and are not mobile.The goal of this study is to develop a method that first extracts the region of interest(ROI),then finds and tracks the items of interest.The suggested system is divided into six stages.The photos from the obtained dataset are appropriately georeferenced to their actual locations in the first phase,after which they are all co-registered.The ROI,or road and its objects,are retrieved using the GrabCut method in the second phase.The third phase entails data preparation.The segmented images’noise is eliminated using Gaussian blur,after which the images are changed to grayscale and forwarded to the following stage for additional morphological procedures.The YOLOv3 algorithm is used in the fourth step to find any automobiles in the photos.Following that,the Kalman filter and centroid tracking are used to perform the tracking of the detected cars.The Lucas-Kanade method is then used to perform the trajectory analysis on the vehicles.The suggested model is put to the test and assessed using the Vehicle Aerial Imaging from Drone(VAID)dataset.For detection and tracking,the model was able to attain accuracy levels of 96.7%and 91.6%,respectively.展开更多
Single object tracking based on deep learning has achieved the advanced performance in many applications of computer vision.However,the existing trackers have certain limitations owing to deformation,occlusion,movemen...Single object tracking based on deep learning has achieved the advanced performance in many applications of computer vision.However,the existing trackers have certain limitations owing to deformation,occlusion,movement and some other conditions.We propose a siamese attentional dense network called SiamADN in an end-to-end offline manner,especially aiming at unmanned aerial vehicle(UAV)tracking.First,it applies a dense network to reduce vanishing-gradient,which strengthens the features transfer.Second,the channel attention mechanism is involved into the Densenet structure,in order to focus on the possible key regions.The advance corner detection network is introduced to improve the following tracking process.Extensive experiments are carried out on four mainly tracking benchmarks as OTB-2015,UAV123,LaSOT and VOT.The accuracy rate on UAV123 is 78.9%,and the running speed is 32 frame per second(FPS),which demonstrates its efficiency in the practical real application.展开更多
Video object tracking is an important research topic of computer vision, whichfinds a wide range of applications in video surveillance, robotics, human-computerinteraction and so on. Although many moving object tracki...Video object tracking is an important research topic of computer vision, whichfinds a wide range of applications in video surveillance, robotics, human-computerinteraction and so on. Although many moving object tracking algorithms have beenproposed, there are still many difficulties in the actual tracking process, such asillumination change, occlusion, motion blurring, scale change, self-change and so on.Therefore, the development of object tracking technology is still challenging. Theemergence of deep learning theory and method provides a new opportunity for theresearch of object tracking, and it is also the main theoretical framework for the researchof moving object tracking algorithm in this paper. In this paper, the existing deeptracking-based target tracking algorithms are classified and sorted out. Based on theprevious knowledge and my own understanding, several solutions are proposed for theexisting methods. In addition, the existing deep learning target tracking method is stilldifficult to meet the requirements of real-time, how to design the network and trackingprocess to achieve speed and effect improvement, there is still a lot of research space.展开更多
基金This work is supported by National Natural Science Foundation of China (NSFC, No. 61340046), National High Technology Research and Development Program of China (863 Program, No. 2006AA04Z247), Scientific and Technical Innovation Commission of Shenzhen Municipality (JCYJ20130331144631730, JCYJ20130331144716089), Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130001110011).
文摘Indoor multi-tracking is more challenging compared with outdoor tasks due to frequent occlusion, view-truncation, severe scale change and pose variation, which may bring considerable unreliability and ambiguity to target representation and data association. So discriminative and reliable target representation is vital for accurate data association in multi-tracking. Pervious works always combine bunch of features to increase the discriminative power, but this is prone to error accumulation and unnecessary computational cost, which may increase ambiguity on the contrary. Moreover, reliability of a same feature in different scenes may vary a lot, especially for currently widespread network cameras, which are settled in various and complex indoor scenes, previous fixed feature selection schemes cannot meet general requirements. To properly handle these problems, first, we propose a scene-adaptive hierarchical data association scheme, which adaptively selects features with higher reliability on target representation in the applied scene, and gradually combines features to the minimum requirement of discriminating ambiguous targets; second, a novel depth-invariant part-based appearance model using RGB-D data is proposed which makes the appearance model robust to scale change, partial occlusion and view-truncation. The introduce of RGB-D data increases the diversity of features, which provides more types of features for feature selection in data association and enhances the final multi-tracking performance. We validate our method from several aspects including scene-adaptive feature selection scheme, hierarchical data association scheme and RGB-D based appearance modeling scheme in various indoor scenes, which demonstrates its effectiveness and efficiency on improving multi-tracking performances in various indoor scenes.
基金supported by the National Natural Science Foundation of China(60835004 60775047+2 种基金 60872130)the National High Technology Research and Development Program of China(863 Program)(2007AA04Z244 2008AA04Z214)
文摘An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using the object models is proposed to track multiple objects in a real-time visual surveillance system. Firstly, for detecting objects, an adaptive kernel density estimation method is utilized, which uses an adaptive bandwidth and features combining colour and gradient. Secondly, some models of objects are built for describing motion, shape and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusions, the optimal "visual" object is found to represent the occluded object, and the posterior probability of pixel is used to determine which pixel is utilized for updating object models. Extensive experiments show that this method improves the accuracy and validity of tracking objects even under occlusions and is used in real-time visual surveillance systems.
文摘MGAC (Motion Geometric Active Contours), a new variational framework of geometric active contours to track multiple nonrigid moving objects in the clutter background in image sequences is presented. This framework, incorporating with the motion edge information, consists of motion detection and tracking stages. At the motion detection stage, the motion edge map provides an approximate edge map of the moving objects. Then, a tracking stage, merely using the static edge information, is considered to improve the motion detection result. Force field regularization method is used to extend the capture range of the edge attraction force field in both stages. Experiments demonstrate that the proposed framework is valid for tracking multiple nonrigid objects in the clutter background.
基金supported in part by National Natural Science Foundation of China(No.62176041)in part by Excellent Science and Technique Talent Foundation of Dalian(No.2022RY21).
文摘Significant advancements have beenwitnessed in visual tracking applications leveragingViT in recent years,mainly due to the formidablemodeling capabilities of Vision Transformer(ViT).However,the strong performance of such trackers heavily relies on ViT models pretrained for long periods,limitingmore flexible model designs for tracking tasks.To address this issue,we propose an efficient unsupervised ViT pretraining method for the tracking task based on masked autoencoders,called TrackMAE.During pretraining,we employ two shared-parameter ViTs,serving as the appearance encoder and motion encoder,respectively.The appearance encoder encodes randomly masked image data,while the motion encoder encodes randomly masked pairs of video frames.Subsequently,an appearance decoder and a motion decoder separately reconstruct the original image data and video frame data at the pixel level.In this way,ViT learns to understand both the appearance of images and the motion between video frames simultaneously.Experimental results demonstrate that ViT-Base and ViT-Large models,pretrained with TrackMAE and combined with a simple tracking head,achieve state-of-the-art(SOTA)performance without additional design.Moreover,compared to the currently popular MAE pretraining methods,TrackMAE consumes only 1/5 of the training time,which will facilitate the customization of diverse models for tracking.For instance,we additionally customize a lightweight ViT-XS,which achieves SOTA efficient tracking performance.
文摘A literature analysis has shown that object search,recognition,and tracking systems are becoming increasingly popular.However,such systems do not achieve high practical results in analyzing small moving living objects ranging from 8 to 14 mm.This article examines methods and tools for recognizing and tracking the class of small moving objects,such as ants.To fulfill those aims,a customized You Only Look Once Ants Recognition(YOLO_AR)Convolutional Neural Network(CNN)has been trained to recognize Messor Structor ants in the laboratory using the LabelImg object marker tool.The proposed model is an extension of the You Only Look Once v4(Yolov4)512×512 model with an additional Self Regularized Non–Monotonic(Mish)activation function.Additionally,the scalable solution for continuous object recognizing and tracking was implemented.This solution is based on the OpenDatacam system,with extended Object Tracking modules that allow for tracking and counting objects that have crossed the custom boundary line.During the study,the methods of the alignment algorithm for finding the trajectory of moving objects were modified.I discovered that the Hungarian algorithm showed better results in tracking small objects than the K–D dimensional tree(k-d tree)matching algorithm used in OpenDataCam.Remarkably,such an algorithm showed better results with the implemented YOLO_AR model due to the lack of False Positives(FP).Therefore,I provided a new tracker module with a Hungarian matching algorithm verified on the Multiple Object Tracking(MOT)benchmark.Furthermore,additional customization parameters for object recognition and tracking results parsing and filtering were added,like boundary angle threshold(BAT)and past frames trajectory prediction(PFTP).Experimental tests confirmed the results of the study on a mobile device.During the experiment,parameters such as the quality of recognition and tracking of moving objects,the PFTP and BAT,and the configuration parameters of the neural network and boundary line model were analyzed.The results showed an increased tracking accuracy with the proposed methods by 50%.The study results confirmed the relevance of the topic and the effectiveness of the implemented methods and tools.
文摘The amount of needed control messages in wireless sensor networks(WSN)is affected by the storage strategy of detected events.Because broadcasting superfluous control messages consumes excess energy,the network lifespan can be extended if the quantity of control messages is decreased.In this study,an optimized storage technique having low control overhead for tracking the objects in WSN is introduced.The basic concept is to retain observed events in internal memory and preserve the relationship between sensed information and sensor nodes using a novel inexpensive data structure entitled Ordered Binary Linked List(OBLL).Whenever an object passes over the sensor area,the recognizing sensor can immediately produce an OBLL along the object’s route.To retrieve the entire information,the OBLL can be traversed with logarithmic complexity which is much less than the traversing complexity of existing linked list structures.Performance evaluation and simulations were carried out to ensure that the suggested technique minimizes the number of messages and thus saving energy and extending the network life.
基金supported by the National Natural Science Foundation of China under Grant 62177029the Postgraduate Research&Practice Innovation Program of Jiangsu Province(KYCX21_0740),China.
文摘Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information.However,current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information.In this paper,we introduce self-calibration multi-head self-attention Transformer(SMSTracker)as a solution to these challenges.It employs a hybrid tensor decomposition self-organizing multihead self-attention transformermechanism,which not only compresses and accelerates Transformer operations but also significantly reduces redundant data,thereby enhancing the accuracy and efficiency of tracking.Additionally,we introduce a self-calibration attention fusion block to resolve common issues of attention ambiguities and inconsistencies found in traditional trackingmethods,ensuring the stability and reliability of tracking performance across various scenarios.By integrating a hybrid tensor decomposition approach with a self-organizingmulti-head self-attentive transformer mechanism,SMSTracker enhances the efficiency and accuracy of the tracking process.Experimental results show that SMSTracker achieves competitive performance in visual object tracking,promising more robust and efficient tracking systems,demonstrating its potential to providemore robust and efficient tracking solutions in real-world applications.
文摘Visual object-tracking is a fundamental task applied in many applications of computer vision. Particle filter is one of the techniques which has been widely used in object tracking. Due to the virtue of extendability and flexibility on both linear and non-linear environments, various particle filter-based trackers have been proposed in the literature. However, the conventional approach cannot handle very large videos efficiently in the current data intensive information age. In this work, a parallelized particle filter is provided in a distributed framework provided by the Hadoop/Map-Reduce infrastructure to tackle object-tracking tasks. The experiments indicate that the proposed algorithm has a better convergence and accuracy as compared to the traditional particle filter. The computational power and the scalability of the proposed particle filter in single object tracking have been enhanced as well.
文摘An improved estimation of motion vectors of feature points is proposed for tracking moving objects of dynamic image sequence. Feature points are firstly extracted by the improved minimum intensity change (MIC) algorithm. The matching points of these feature points are then determined by adaptive rood pattern searching. Based on the random sample consensus (RANSAC) method, the background motion is finally compensated by the parameters of an affine transform of the background motion. With reasonable morphological filtering, the moving objects are completely extracted from the background, and then tracked accurately. Experimental results show that the improved method is successful on the motion background compensation and offers great promise in tracking moving objects of the dynamic image sequence.
基金National Natural Science Foundation of China(No.61201412)
文摘In practical application,mean shift tracking algorithm is easy to generate tracking drift when the target and the background have similar color distribution.Based on the mean shift algorithm,a kind of background weaken weight is proposed in the paper firstly.Combining with the object center weight based on the kernel function,the problem of interference of the similar color background can be solved.And then,a model updating strategy is presented to improve the tracking robustness on the influence of occlusion,illumination,deformation and so on.With the test on the sequence of Tiger,the proposed approach provides better performance than the original mean shift tracking algorithm.
基金supported in part by the Institute for Guo Qiang of Tsinghua University(2019GQG1023)in part by Graduate Education and Teaching Reform Project of Tsinghua University(202007J007)+1 种基金in part by National Natural Science Foundation of China(U19B2029,62073028,61803222)in part by the Independent Research Program of Tsinghua University(2018Z05JDX002)。
文摘There are two main trends in the development of unmanned aerial vehicle(UAV)technologies:miniaturization and intellectualization,in which realizing object tracking capabilities for a nano-scale UAV is one of the most challenging problems.In this paper,we present a visual object tracking and servoing control system utilizing a tailor-made 38 g nano-scale quadrotor.A lightweight visual module is integrated to enable object tracking capabilities,and a micro positioning deck is mounted to provide accurate pose estimation.In order to be robust against object appearance variations,a novel object tracking algorithm,denoted by RMCTer,is proposed,which integrates a powerful short-term tracking module and an efficient long-term processing module.In particular,the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner.Furthermore,a positionbased visual servoing control method is proposed for the quadrotor,where an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques.Stable and accurate object tracking is achieved even under disturbances.Experimental results are presented to demonstrate the high accuracy and stability of the whole tracking system.
基金the Framework of International Cooperation Program managed by the National Research Foundation of Korea(2019K1A3A1A8011295711).
文摘Collaborative Robotics is one of the high-interest research topics in the area of academia and industry.It has been progressively utilized in numerous applications,particularly in intelligent surveillance systems.It allows the deployment of smart cameras or optical sensors with computer vision techniques,which may serve in several object detection and tracking tasks.These tasks have been considered challenging and high-level perceptual problems,frequently dominated by relative information about the environment,where main concerns such as occlusion,illumination,background,object deformation,and object class variations are commonplace.In order to show the importance of top view surveillance,a collaborative robotics framework has been presented.It can assist in the detection and tracking of multiple objects in top view surveillance.The framework consists of a smart robotic camera embedded with the visual processing unit.The existing pre-trained deep learning models named SSD and YOLO has been adopted for object detection and localization.The detection models are further combined with different tracking algorithms,including GOTURN,MEDIANFLOW,TLD,KCF,MIL,and BOOSTING.These algorithms,along with detection models,help to track and predict the trajectories of detected objects.The pre-trained models are employed;therefore,the generalization performance is also investigated through testing the models on various sequences of top view data set.The detection models achieved maximum True Detection Rate 93%to 90%with a maximum 0.6%False Detection Rate.The tracking results of different algorithms are nearly identical,with tracking accuracy ranging from 90%to 94%.Furthermore,a discussion has been carried out on output results along with future guidelines.
基金supported by the National Natural Science Foundation of China(Grant No.51009040)Heilongjiang Postdoctoral Fund(Grant No.LBH-Z11205)+1 种基金the National High Technology Research and Development Program of China(863 Program,Grant No.2011AA09A106)the China Postdoctoral Science Foundation(Grant No.2012M510928)
文摘This paper describes a new framework for object detection and tracking of AUV including underwater acoustic data interpolation, underwater acoustic images segmentation and underwater objects tracking. This framework is applied to the design of vision-based method for AUV based on the forward looking sonar sensor. First, the real-time data flow (underwater acoustic images) is pre-processed to form the whole underwater acoustic image, and the relevant position information of objects is extracted and determined. An improved method of double threshold segmentation is proposed to resolve the problem that the threshold cannot be adjusted adaptively in the traditional method. Second, a representation of region information is created in light of the Gaussian particle filter. The weighted integration strategy combining the area and invariant moment is proposed to perfect the weight of particles and to enhance the tracking robustness. Results obtained on the real acoustic vision platform of AUV during sea trials are displayed and discussed. They show that the proposed method can detect and track the moving objects underwater online, and it is effective and robust.
基金supported by National Basic Research Program of China (973 Program) (No. 2006CB300407)National Natural Science Foundation of China (No. 50775017)
文摘Inspired by human behaviors, a robot object tracking model is proposed on the basis of visual attention mechanism, which is fit for the theory of topological perception. The model integrates the image-driven, bottom-up attention and the object-driven, top-down attention, whereas the previous attention model has mostly focused on either the bottom-up or top-down attention. By the bottom-up component, the whole scene is segmented into the ground region and the salient regions. Guided by top-down strategy which is achieved by a topological graph, the object regions are separated from the salient regions. The salient regions except the object regions are the barrier regions. In order to estimate the model, a mobile robot platform is developed, on which some experiments are implemented. The experimental results indicate that processing an image with a resolution of 752 × 480 pixels takes less than 200 ms and the object regions are unabridged. The analysis obtained by comparing the proposed model with the existing model demonstrates that the proposed model has some advantages in robot object tracking in terms of speed and efficiency.
文摘A method for moving object recognition and tracking in the intelligent traffic monitoring system is presented. For the shortcomings and deficiencies of the frame-subtraction method, a redundant discrete wavelet transform (RDWT) based moving object recognition algorithm is put forward, which directly detects moving objects in the redundant discrete wavelet transform domain. An improved adaptive mean-shift algorithm is used to track the moving object in the follow up frames. Experimental results show that the algorithm can effectively extract the moving object, even though the object is similar to the background, and the results are better than the traditional frame-subtraction method. The object tracking is accurate without the impact of changes in the size of the object. Therefore the algorithm has a certain practical value and prospect.
基金This work was supported in part by the Beijing Natural Science Foundation(L191004)the National Natural Science Foundation of China under No.61720106007 and No.61872047+1 种基金the Beijing Nova Program under No.Z201100006820124the Funds for Cre ative Research Groups of China under No.61921003,and the 111 Project(B18008).
文摘In this paper,we provide a new approach for intelligent traffic transportation in the intelligent vehicular networks,which aims at collecting the vehicles’locations,trajectories and other key driving parameters for the time-critical autonomous driving’s requirement.The key of our method is a multi-vehicle tracking framework in the traffic monitoring scenario..Our proposed framework is composed of three modules:multi-vehicle detection,multi-vehicle association and miss-detected vehicle tracking.For the first module,we integrate self-attention mechanism into detector of using key point estimation for better detection effect.For the second module,we apply the multi-dimensional information for robustness promotion,including vehicle re-identification(Re-ID)features,historical trajectory information,and spatial position information For the third module,we re-track the miss-detected vehicles with occlusions in the first detection module.Besides,we utilize the asymmetric convolution and depth-wise separable convolution to reduce the model’s parameters for speed-up.Extensive experimental results show the effectiveness of our proposed multi-vehicle tracking framework.
文摘In dense pedestrian tracking,frequent object occlusions and close distances between objects cause difficulty when accurately estimating object trajectories.In this study,a conditional random field tracking model is established by using a visual long short term memory network in the three-dimensional(3D)space and the motion estimations jointly performed on object trajectory segments.Object visual field information is added to the long short term memory network to improve the accuracy of the motion related object pair selection and motion estimation.To address the uncertainty of the length and interval of trajectory segments,a multimode long short term memory network is proposed for the object motion estimation.The tracking performance is evaluated using the PETS2009 dataset.The experimental results show that the proposed method achieves better performance than the tracking methods based on the independent motion estimation.
基金supported by the MSIT(Ministry of Science and ICT),Korea,under the ICAN(ICT Challenge and Advanced Network of HRD)program(IITP-2023-RS-2022-00156326)supervised by the IITP(Institute of Information&Communications Technology Planning&Evaluation).
文摘Unmanned aerial vehicles(UAVs)can be used to monitor traffic in a variety of settings,including security,traffic surveillance,and traffic control.Numerous academics have been drawn to this topic because of the challenges and the large variety of applications.This paper proposes a new and efficient vehicle detection and tracking system that is based on road extraction and identifying objects on it.It is inspired by existing detection systems that comprise stationary data collectors such as induction loops and stationary cameras that have a limited field of view and are not mobile.The goal of this study is to develop a method that first extracts the region of interest(ROI),then finds and tracks the items of interest.The suggested system is divided into six stages.The photos from the obtained dataset are appropriately georeferenced to their actual locations in the first phase,after which they are all co-registered.The ROI,or road and its objects,are retrieved using the GrabCut method in the second phase.The third phase entails data preparation.The segmented images’noise is eliminated using Gaussian blur,after which the images are changed to grayscale and forwarded to the following stage for additional morphological procedures.The YOLOv3 algorithm is used in the fourth step to find any automobiles in the photos.Following that,the Kalman filter and centroid tracking are used to perform the tracking of the detected cars.The Lucas-Kanade method is then used to perform the trajectory analysis on the vehicles.The suggested model is put to the test and assessed using the Vehicle Aerial Imaging from Drone(VAID)dataset.For detection and tracking,the model was able to attain accuracy levels of 96.7%and 91.6%,respectively.
基金supported by the Zhejiang Key Laboratory of General Aviation Operation Technology(No.JDGA2020-7)the National Natural Science Foundation of China(No.62173237)+3 种基金the Natural Science Foundation of Liaoning Province(No.2019-MS-251)the Talent Project of Revitalization Liaoning Province(No.XLYC1907022)the Key R&D Projects of Liaoning Province(No.2020JH2/10100045)the High-Level Innovation Talent Project of Shenyang(No.RC190030).
文摘Single object tracking based on deep learning has achieved the advanced performance in many applications of computer vision.However,the existing trackers have certain limitations owing to deformation,occlusion,movement and some other conditions.We propose a siamese attentional dense network called SiamADN in an end-to-end offline manner,especially aiming at unmanned aerial vehicle(UAV)tracking.First,it applies a dense network to reduce vanishing-gradient,which strengthens the features transfer.Second,the channel attention mechanism is involved into the Densenet structure,in order to focus on the possible key regions.The advance corner detection network is introduced to improve the following tracking process.Extensive experiments are carried out on four mainly tracking benchmarks as OTB-2015,UAV123,LaSOT and VOT.The accuracy rate on UAV123 is 78.9%,and the running speed is 32 frame per second(FPS),which demonstrates its efficiency in the practical real application.
基金supported by National Natural Science Foundationof China (Grant No. 51874300)the National Natural Science Foundation of China andShanxi Provincial People’s Government Jointly Funded Project of China for Coal Baseand Low Carbon (Grant No. U1510115)+2 种基金National Natural Science Foundation of China(51104157)the Qing Lan Project, the China Postdoctoral Science Foundation (Grant No.2013T60574)the Scientific Instrument Developing Project of the Chinese Academy ofSciences (Grant No. YJKYYQ20170074).
文摘Video object tracking is an important research topic of computer vision, whichfinds a wide range of applications in video surveillance, robotics, human-computerinteraction and so on. Although many moving object tracking algorithms have beenproposed, there are still many difficulties in the actual tracking process, such asillumination change, occlusion, motion blurring, scale change, self-change and so on.Therefore, the development of object tracking technology is still challenging. Theemergence of deep learning theory and method provides a new opportunity for theresearch of object tracking, and it is also the main theoretical framework for the researchof moving object tracking algorithm in this paper. In this paper, the existing deeptracking-based target tracking algorithms are classified and sorted out. Based on theprevious knowledge and my own understanding, several solutions are proposed for theexisting methods. In addition, the existing deep learning target tracking method is stilldifficult to meet the requirements of real-time, how to design the network and trackingprocess to achieve speed and effect improvement, there is still a lot of research space.