Visual object tracking plays a crucial role in computer vision. In recent years, researchers have proposed various methods to achieve high-performance object tracking. Among these, Transformer-based methods have become a research hotspot because they can model global context. However, current Transformer-based trackers still suffer from limited tracking accuracy and redundant feature information. In this paper, we introduce the self-calibration multi-head self-attention Transformer (SMSTracker) to address these challenges. It employs a hybrid tensor decomposition self-organizing multi-head self-attention Transformer mechanism, which compresses and accelerates Transformer operations and significantly reduces redundant data, thereby improving the accuracy and efficiency of tracking. Additionally, we introduce a self-calibration attention fusion block to resolve the attention ambiguities and inconsistencies common in traditional tracking methods, keeping tracking performance stable and reliable across scenarios. Experimental results show that SMSTracker achieves competitive performance in visual object tracking, demonstrating its potential to provide more robust and efficient tracking solutions in real-world applications.
Visual tracking has advanced significantly in recent years, mainly owing to the strong modeling capabilities of the Vision Transformer (ViT). However, the strong performance of such trackers relies heavily on ViT models pretrained for long periods, which limits more flexible model designs for tracking tasks. To address this issue, we propose TrackMAE, an efficient unsupervised ViT pretraining method for tracking based on masked autoencoders. During pretraining, we employ two shared-parameter ViTs that serve as the appearance encoder and the motion encoder, respectively. The appearance encoder encodes randomly masked image data, while the motion encoder encodes randomly masked pairs of video frames. An appearance decoder and a motion decoder then separately reconstruct the original image data and video frame data at the pixel level. In this way, the ViT learns the appearance of images and the motion between video frames simultaneously. Experimental results demonstrate that ViT-Base and ViT-Large models pretrained with TrackMAE and combined with a simple tracking head achieve state-of-the-art (SOTA) performance without additional design. Moreover, compared with the currently popular MAE pretraining methods, TrackMAE consumes only 1/5 of the training time, which facilitates the customization of diverse models for tracking. For instance, we additionally customize a lightweight ViT-XS, which achieves SOTA performance among efficient trackers.
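The random masking step described in this abstract can be illustrated with a minimal sketch. The function name `random_patch_mask`, the 75% mask ratio, and the 196-patch example are assumptions chosen for illustration; the abstract does not state the ratio TrackMAE uses or whether the two frames of a pair share a mask.

```python
import random

def random_patch_mask(num_patches, mask_ratio=0.75, rng=random):
    """Split patch indices into visible and masked sets.

    mask_ratio is a hypothetical default, not a value from the abstract.
    """
    num_masked = int(num_patches * mask_ratio)
    # Sample the masked indices without replacement.
    masked = set(rng.sample(range(num_patches), num_masked))
    # Only the visible patches would be fed to the encoder.
    visible = [i for i in range(num_patches) if i not in masked]
    return visible, sorted(masked)

# Example: a 14 x 14 patch grid (196 patches) for one image.
visible, masked = random_patch_mask(196, rng=random.Random(0))
```

In a masked autoencoder, the decoder is then asked to reconstruct the pixels of the masked patches from the encoded visible ones.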
Visual object tracking is a fundamental task in many computer vision applications. The particle filter is one of the techniques most widely used in object tracking. Because it extends flexibly to both linear and non-linear environments, various particle filter-based trackers have been proposed in the literature. However, the conventional approach cannot handle very large videos efficiently in the current data-intensive information age. In this work, a parallelized particle filter is implemented in a distributed framework built on the Hadoop/MapReduce infrastructure to tackle object-tracking tasks. The experiments indicate that the proposed algorithm achieves better convergence and accuracy than the traditional particle filter. The computational power and scalability of the proposed particle filter in single-object tracking are enhanced as well.
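For readers unfamiliar with the technique, the following is a minimal sketch of one step of a generic bootstrap particle filter on a one-dimensional position. It is not the paper's distributed implementation; the function name, the Gaussian motion and observation models, and the noise parameters are all assumptions made for illustration.

```python
import math
import random

def particle_filter_step(particles, observation,
                         motion_std=1.0, obs_std=2.0, rng=random):
    """One predict / weight / resample cycle (hypothetical 1-D model)."""
    # Predict: move every particle with random-walk noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the predicted particles.
    estimate = sum(p * w for p, w in zip(predicted, weights))
    # Resample: draw a new particle set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate
```

The predict and weight stages act on each particle independently, which is what makes a map-style split across workers natural, while normalization and resampling need the weights gathered in one place. How the paper actually partitions the work across Hadoop is not stated in the abstract.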
This paper discusses a new approach to multiple-object tracking relative to background information. The concept of multiple-object tracking through background learning is based on the theory of relativity, which involves a frame of reference in the spatial domain to localize and/or track any object. The field of multiple-object tracking has seen a lot of research, but researchers have considered the background redundant. However, in object tracking the background plays a vital role and leads to a definite improvement in the overall tracking process. In the present work, an algorithm is proposed for multiple-object tracking through background learning. The learning framework is based on a graph embedding approach for localizing multiple objects. The graph utilizes the inherent capabilities of depth modelling, which assist in avoiding occlusion among multiple objects prior to tracking. The proposed algorithm has been compared with recent work in the literature on numerous performance evaluation measures, and it is observed to give better performance.
Object tracking, an important technology in image processing and computer vision, is used to continuously track a specific object or person in an image. This technology may be effective in identifying the same person within one image, but it has limitations in handling multiple images because it is difficult to determine whether an object appearing in other images is the same one. When tracking the same object across two or more images, there must be a way to determine that objects in different images are the same object. Therefore, this paper attempts to identify the same object in different images using color information, which is among the unique information of the object. This study proposes a multiple-object-tracking method using histogram stamp extraction in closed-circuit television applications. The proposed method determines the presence or absence of a target object in an image by comparing the similarity between the image containing the target object and other images. To this end, a unique color value of the target object is extracted from its color distribution in the image using three methods: mean, mode, and interquartile range. The Top-N accuracy method is used to analyze the accuracy of each method, and the results show that the mean method had an accuracy of 93.5% (Top-2). Furthermore, the positive prediction value experiment shows that the accuracy of the mean method was 65.7%. The analysis indicates that the same object in different images can be detected and tracked using the unique color of the object, which can reduce manpower and avoids the use of personal information. Finally, the response speed experiment showed that, with the mean method, the color of the object could be extracted in real time in 0.016954 s, so the proposed method can detect and track the same object in real time.
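The idea of reducing an object's color distribution to a single value and ranking candidates by closeness can be sketched as follows. This is an illustration only: the function names are hypothetical, the inputs are assumed to be lists of single-channel color values (for example hue), and the "iqr" variant is assumed here to mean the average of the values lying within the interquartile range, since the abstract does not say how the interquartile range is turned into a color value.

```python
from statistics import mean, mode, quantiles

def color_stamp(values, method="mean"):
    """Reduce a list of single-channel color values to one number."""
    if method == "mean":
        return mean(values)
    if method == "mode":
        return mode(values)
    if method == "iqr":
        # Assumption: average only the values between Q1 and Q3.
        q1, _, q3 = quantiles(values, n=4)
        inside = [v for v in values if q1 <= v <= q3]
        return mean(inside)
    raise ValueError(f"unknown method: {method}")

def top_n_matches(target_values, candidates, n=2, method="mean"):
    """Return the n candidate names whose stamp is closest to the target."""
    target = color_stamp(target_values, method)
    ranked = sorted(
        candidates,
        key=lambda name: abs(color_stamp(candidates[name], method) - target),
    )
    return ranked[:n]

# Hypothetical data: one target object and three detections in another image.
target = [100, 102, 98, 101]
detections = {
    "a": [99, 101, 100, 103],
    "b": [150, 152, 149, 151],
    "c": [20, 22, 19, 21],
}
print(top_n_matches(target, detections, n=2))  # ['a', 'b']
```

A Top-2 hit, in the sense used by the abstract, would then mean that the true match appears among the two names returned.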
This research addresses the challenge of integrating and cleaning data, which is a crucial task in object matching. When an object is detected and then measured, vibration at different light intensities may influence the durability and reliability of mechanical systems or structures and cause problems such as damage, abnormal stopping, and disaster. Recent research has failed to improve the accuracy rate and the computation time of object tracking and vibration measurement. To address these problems, the proposed research simplifies the determination of the scaling factor by assigning a known real-world dimension to a predetermined portion of the image. A novel white sticker of known dimensions, marked with a color dot, is pasted on the surface of the object to obtain the best template-matching result with the Improved Up-Sampled Cross-Correlation (UCC) algorithm. Vibration is calculated using the Finite-Difference Algorithm (FDA), with a machine vision system fitted with a macro lens sensor that captures images at close range without affecting the quality of the displacement measurement from the video frames. The field test was conducted on TAFE (Tractors and Farm Equipment Limited) tractor parts. The error was between 30% and 50% at very low vibration values close to zero, whereas it was between 5% and 10% at most high accelerations, the essential range for vibration analysis. The suggested system is therefore more suitable for measuring the vibration of stationary machinery with low frequency ranges, and the macro lens makes it possible to capture image frames at very close range. Because the error reaches 30% to 50% when the vibration amplitude is very small, the method is not suitable for nano-vibration analysis.
To address the sensitivity of a single correlation filter model to complex scenes such as background interference and occlusion, a tracking algorithm based on multi-time-space perception and instance-specific proposals is proposed to optimize the mathematical model of the correlation filter (CF). Firstly, according to the consistency of the changes between the object frames and the filter frames, a mask matrix is introduced into the objective function of the filter to extract the spatio-temporal information of the object with background awareness. Secondly, a multi-feature fusion objective function is constructed for object localization, which is optimized by the Lagrange method and solved by closed iteration. During filter optimization, a time-space perception constraint term is designed to enhance the learning ability of the CF and improve the final tracking results. Finally, when the tracking results fluctuate, a boundary suppression factor is introduced into the instance-specific proposals to effectively reduce the risk of model drift. The accuracy and success rate of the proposed algorithm are verified by simulation analysis on two popular benchmarks, the object tracking benchmark 2015 (OTB2015) and the temple color 128 (TC-128). Extensive experimental results show that the optimized appearance model of the proposed algorithm is effective. The distance precision rate and overlap success rate of the proposed algorithm are 0.756 and 0.656, respectively, on the OTB2015 benchmark, which are better than the results of the competing algorithms. The results of this study can help solve the problem of real-time object tracking in real traffic environments and provide a specific reference for the detection of traffic abnormalities.
The field of object tracking has recently made significant progress. In particular, trackers based on deep learning and on correlation filters have achieved effective tracking performance. However, object tracking still faces difficulties such as illumination variation and deformation (DEF), which degrade the precision and accuracy of tracking algorithms. This research proposes a new tracking algorithm to handle this problem. Features are extracted using a Modified LeNet-5, and precision and accuracy are improved by the Real-Time Cross-modality Correlation Filtering (RCCF) method. In the Modified LeNet-5, visual tracking performance is improved by adjusting the number and size of the convolution kernels in the pooling and convolution layers. High-level, middle-level, and handcrafted features are extracted from the modified LeNet-5 network. The handcrafted features are used to determine the specific location of the target because they contain more spatial information about the visual object, while the LeNet features are better suited to handling changes in target appearance. Extensive experiments were conducted on the Object Tracking Benchmark (OTB) databases OTB50 and OTB100. The experimental results reveal that the proposed tracker outperforms other state-of-the-art trackers under different challenges. The experimental simulation is carried out in Python. The overall success rate and precision of the proposed algorithm are 93.8% and 92.5%, respectively. The average running frame rate reaches 42 frames per second, which meets real-time requirements.
With the advent of real-time applications such as autonomous driving, visual surveillance, and sports analysis, Multiple-Object Tracking (MOT) is receiving increasing attention. The tracking-by-detection paradigm, a commonly used approach, connects the current detection hypotheses to previously estimated object trajectories by comparing the similarity of their appearance or motion. For efficient detection and tracking of numerous objects in a complex environment, a Pearson Similarity-centred Kuhn-Munkres (PS-KM) algorithm is proposed in the present study. The input videos were first gathered from the MOT dataset and converted into frames. Background subtraction was then applied to filter out inappropriate data from the frames. Next, features were extracted from the frames, and the higher-dimensional features were transformed into lower-dimensional features through feature reduction with Information Gain-centred Singular Value Decomposition (IG-SVD). Classification was then performed using the Modified Recurrent Neural Network (MRNN) method, which also identified the categories of the objects. The PS-KM algorithm then tracked the recognized objects. Finally, the experimental outcomes showed that numerous targets were precisely tracked by the proposed system with 97% accuracy and a low false positive rate (FPR) of 2.3%. It was also shown that the presented techniques compared favorably with existing models, viz. RNN, CNN, and KNN.
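The association step, scoring track-detection pairs by Pearson similarity and then choosing a one-to-one assignment, can be sketched as below. All names and feature vectors are hypothetical. For brevity the optimal assignment is found by exhaustive search over permutations, which yields the same optimum as Kuhn-Munkres on small inputs but is not the Kuhn-Munkres algorithm itself and scales factorially. The sketch also assumes there are at least as many detections as tracks.

```python
import math
from itertools import permutations

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_a * var_b)

def assign(tracks, detections):
    """Return (track_index, detection_index) pairs maximizing total similarity."""
    sim = [[pearson(t, d) for d in detections] for t in tracks]
    # Brute-force stand-in for Kuhn-Munkres: try every one-to-one mapping.
    best = max(
        permutations(range(len(detections)), len(tracks)),
        key=lambda perm: sum(sim[i][j] for i, j in enumerate(perm)),
    )
    return list(enumerate(best))

# Hypothetical feature vectors for two tracks and two new detections.
tracks = [[1, 2, 3, 4], [4, 3, 2, 1]]
detections = [[8, 6, 4, 2.5], [2, 4, 6, 8.5]]
print(assign(tracks, detections))  # [(0, 1), (1, 0)]
```

In practice a polynomial-time solver would replace the permutation search once the number of objects grows beyond a handful.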
The number of control messages needed in wireless sensor networks (WSN) is affected by the storage strategy for detected events. Because broadcasting superfluous control messages consumes excess energy, the network lifespan can be extended if the number of control messages is reduced. In this study, an optimized storage technique with low control overhead for tracking objects in WSN is introduced. The basic concept is to retain observed events in internal memory and preserve the relationship between sensed information and sensor nodes using a novel, inexpensive data structure called the Ordered Binary Linked List (OBLL). Whenever an object passes through the sensor area, the detecting sensor can immediately produce an OBLL along the object's route. To retrieve the entire information, the OBLL can be traversed with logarithmic complexity, which is much lower than the traversal complexity of existing linked list structures. Performance evaluation and simulations were carried out to confirm that the suggested technique minimizes the number of messages, thereby saving energy and extending the network lifetime.
An improved estimation of the motion vectors of feature points is proposed for tracking moving objects in dynamic image sequences. Feature points are first extracted by the improved minimum intensity change (MIC) algorithm. The matching points of these feature points are then determined by adaptive rood pattern searching. Based on the random sample consensus (RANSAC) method, the background motion is finally compensated using the parameters of an affine transform of the background motion. With reasonable morphological filtering, the moving objects are completely extracted from the background and then tracked accurately. Experimental results show that the improved method succeeds in compensating background motion and offers great promise for tracking moving objects in dynamic image sequences.
There are two main trends in the development of unmanned aerial vehicle (UAV) technologies, miniaturization and intellectualization, within which realizing object tracking capabilities for a nano-scale UAV is one of the most challenging problems. In this paper, we present a visual object tracking and servoing control system utilizing a tailor-made 38 g nano-scale quadrotor. A lightweight visual module is integrated to enable object tracking, and a micro positioning deck is mounted to provide accurate pose estimation. To be robust against variations in object appearance, a novel object tracking algorithm, denoted RMCTer, is proposed, which integrates a powerful short-term tracking module and an efficient long-term processing module. In particular, the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner. Furthermore, a position-based visual servoing control method is proposed for the quadrotor, in which an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques. Stable and accurate object tracking is achieved even under disturbances. Experimental results demonstrate the high accuracy and stability of the whole tracking system.
This paper describes a new framework for object detection and tracking for an autonomous underwater vehicle (AUV), including underwater acoustic data interpolation, underwater acoustic image segmentation, and underwater object tracking. The framework is applied to the design of a vision-based method for an AUV equipped with a forward-looking sonar sensor. First, the real-time data flow (underwater acoustic images) is pre-processed to form the whole underwater acoustic image, and the relevant position information of objects is extracted and determined. An improved double-threshold segmentation method is proposed to resolve the problem that the threshold cannot be adjusted adaptively in the traditional method. Second, a representation of region information is created based on the Gaussian particle filter. A weighted integration strategy combining the area and the invariant moment is proposed to refine the particle weights and enhance tracking robustness. Results obtained on the real acoustic vision platform of the AUV during sea trials are presented and discussed. They show that the proposed method can detect and track moving underwater objects online and that it is effective and robust.
Collaborative robotics is a high-interest research topic in both academia and industry. It has been progressively utilized in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve in several object detection and tracking tasks. These tasks are considered challenging, high-level perceptual problems, frequently dominated by relative information about the environment, where concerns such as occlusion, illumination, background, object deformation, and object class variations are commonplace. To show the importance of top-view surveillance, a collaborative robotics framework is presented that can assist in the detection and tracking of multiple objects in top-view surveillance. The framework consists of a smart robotic camera embedded with a visual processing unit. The existing pre-trained deep learning models SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING. These algorithms, along with the detection models, help to track and predict the trajectories of detected objects. Because pre-trained models are employed, the generalization performance is also investigated by testing the models on various sequences of a top-view data set. The detection models achieved a True Detection Rate between 90% and 93% with a maximum False Detection Rate of 0.6%. The tracking results of the different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, the output results are discussed along with guidelines for future work.
Inspired by human behavior, a robot object tracking model is proposed on the basis of a visual attention mechanism that is consistent with the theory of topological perception. The model integrates image-driven, bottom-up attention with object-driven, top-down attention, whereas previous attention models have mostly focused on either bottom-up or top-down attention. The bottom-up component segments the whole scene into the ground region and the salient regions. Guided by a top-down strategy implemented with a topological graph, the object regions are separated from the salient regions. The salient regions other than the object regions are the barrier regions. To evaluate the model, a mobile robot platform was developed, on which several experiments were carried out. The experimental results indicate that processing an image with a resolution of 752 × 480 pixels takes less than 200 ms and that the object regions are preserved intact. A comparison of the proposed model with the existing model demonstrates that the proposed model has advantages in robot object tracking in terms of speed and efficiency.
A method for moving-object recognition and tracking in intelligent traffic monitoring systems is presented. To overcome the shortcomings of the frame-subtraction method, a moving-object recognition algorithm based on the redundant discrete wavelet transform (RDWT) is put forward, which detects moving objects directly in the RDWT domain. An improved adaptive mean-shift algorithm is used to track the moving object in the subsequent frames. Experimental results show that the algorithm can effectively extract the moving object even when the object is similar to the background, and that the results are better than those of the traditional frame-subtraction method. The object tracking remains accurate despite changes in the size of the object. The algorithm therefore has practical value and promise.
An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using object models is proposed to track multiple objects in a real-time visual surveillance system. Firstly, objects are detected with an adaptive kernel density estimation method, which uses an adaptive bandwidth and features combining colour and gradient. Secondly, models of the objects are built to describe motion, shape, and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusion, the optimal "visual" object is found to represent the occluded object, and the posterior probability of each pixel is used to determine which pixels are used to update the object models. Extensive experiments show that this method improves the accuracy and validity of object tracking even under occlusion and can be used in real-time visual surveillance systems.
Single-object tracking based on deep learning has achieved advanced performance in many computer vision applications. However, existing trackers have certain limitations under deformation, occlusion, movement, and other conditions. We propose a Siamese attentional dense network called SiamADN, trained in an end-to-end offline manner and aimed especially at unmanned aerial vehicle (UAV) tracking. First, it applies a dense network to reduce the vanishing-gradient problem, which strengthens feature transfer. Second, a channel attention mechanism is incorporated into the DenseNet structure in order to focus on possible key regions. An advanced corner detection network is introduced to improve the subsequent tracking process. Extensive experiments are carried out on four main tracking benchmarks: OTB-2015, UAV123, LaSOT, and VOT. The accuracy rate on UAV123 is 78.9%, and the running speed is 32 frames per second (FPS), which demonstrates its efficiency in practical applications.
Video object tracking is an important research topic in computer vision, with a wide range of applications in video surveillance, robotics, human-computer interaction, and so on. Although many moving-object tracking algorithms have been proposed, there are still many difficulties in the actual tracking process, such as illumination change, occlusion, motion blur, scale change, and self-change. The development of object tracking technology therefore remains challenging. The emergence of deep learning theory and methods provides a new opportunity for object tracking research, and deep learning is also the main theoretical framework for the study of moving-object tracking algorithms in this paper. In this paper, the existing deep-learning-based target tracking algorithms are classified and organized. Based on previous knowledge and the author's own understanding, several solutions are proposed for the existing methods. In addition, existing deep learning target tracking methods still struggle to meet real-time requirements, and there remains considerable room for research on how to design the network and the tracking process to improve both speed and effectiveness.
If a fairly fast-moving object exists in a complicated tracking environment, the snake's nodes may fall into inaccurate local minima. We propose a mean shift snake algorithm to solve this problem. However, if the object moves beyond the operating limits of the mean shift snake module in successive frames, the mean shift snake's nodes may also fall into local minima as they move to the new object position. This paper presents a motion compensation strategy using a particle filter, and thus proposes a new Particle Filter Mean Shift Snake (PFMSS) algorithm, which combines the particle filter with the mean shift snake to estimate the contour of a fast-moving object. Firstly, the fast-moving object is tracked by the particle filter to obtain a coarse position, which is used to initialize the mean shift algorithm. Secondly, the relevant motion information is used to compensate the snake's node positions. Finally, the snake algorithm is used to extract the exact object contour, and the useful information about the object is fed back. Several real-world sequences are tested, and the results show that the novel tracking method performs well, with high accuracy, on fast-moving objects in cluttered backgrounds.
Funding (SMSTracker): This work was supported by the National Natural Science Foundation of China under Grant 62177029 and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX21_0740), China.
Funding: Supported in part by the National Natural Science Foundation of China (No. 62176041) and in part by the Excellent Science and Technique Talent Foundation of Dalian (No. 2022RY21).
Abstract: Significant advancements have been witnessed in visual tracking applications leveraging ViT in recent years, mainly due to the formidable modeling capabilities of the Vision Transformer (ViT). However, the strong performance of such trackers relies heavily on ViT models pretrained for long periods, limiting more flexible model designs for tracking tasks. To address this issue, we propose an efficient unsupervised ViT pretraining method for the tracking task based on masked autoencoders, called TrackMAE. During pretraining, we employ two shared-parameter ViTs, serving as the appearance encoder and the motion encoder, respectively. The appearance encoder encodes randomly masked image data, while the motion encoder encodes randomly masked pairs of video frames. Subsequently, an appearance decoder and a motion decoder separately reconstruct the original image data and video frame data at the pixel level. In this way, ViT learns to understand both the appearance of images and the motion between video frames simultaneously. Experimental results demonstrate that ViT-Base and ViT-Large models, pretrained with TrackMAE and combined with a simple tracking head, achieve state-of-the-art (SOTA) performance without additional design. Moreover, compared to the currently popular MAE pretraining methods, TrackMAE consumes only 1/5 of the training time, which will facilitate the customization of diverse models for tracking. For instance, we additionally customize a lightweight ViT-XS, which achieves SOTA efficient tracking performance.
Abstract: Visual object tracking is a fundamental task in many computer vision applications. The particle filter is one of the techniques that has been widely used in object tracking. Owing to its extensibility and flexibility in both linear and non-linear environments, various particle filter-based trackers have been proposed in the literature. However, the conventional approach cannot handle very large videos efficiently in the current data-intensive information age. In this work, a parallelized particle filter is implemented in a distributed framework built on the Hadoop/MapReduce infrastructure to tackle object-tracking tasks. The experiments indicate that the proposed algorithm has better convergence and accuracy than the traditional particle filter. The computational power and scalability of the proposed particle filter in single object tracking are enhanced as well.
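As a rough illustration of the conventional particle filter that the abstract above uses as its baseline, the following Python sketch shows one predict/weight/resample cycle for a 1-D position. The function names, the constant-velocity motion model, and the Gaussian likelihood are assumptions made for illustration and are not taken from the paper; the Hadoop/MapReduce parallelization is not shown.

```python
import math
import random


def predict(particles, velocity, noise_std, rng):
    """Move every particle by an assumed constant velocity plus Gaussian noise."""
    return [p + velocity + rng.gauss(0.0, noise_std) for p in particles]


def update_weights(particles, measurement, meas_std):
    """Weight each particle by an assumed Gaussian likelihood, then normalize."""
    raw = [math.exp(-0.5 * ((p - measurement) / meas_std) ** 2) for p in particles]
    total = sum(raw)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in raw]


def systematic_resample(particles, weights, u0):
    """Systematic resampling; u0 is a single offset drawn from [0, 1)."""
    n = len(particles)
    resampled = []
    cumulative = weights[0]
    j = 0
    for i in range(n):
        position = (u0 + i) / n
        while position > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        resampled.append(particles[j])
    return resampled


def estimate(particles, weights):
    """Weighted mean of the particle positions."""
    return sum(p * w for p, w in zip(particles, weights))


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(0.0, 10.0) for _ in range(200)]
    particles = predict(particles, velocity=1.0, noise_std=0.5, rng=rng)
    weights = update_weights(particles, measurement=6.0, meas_std=1.0)
    print("estimated position:", estimate(particles, weights))
    particles = systematic_resample(particles, weights, rng.random())
```

The weight update treats each particle independently, which is the kind of step that can be spread across workers; how the paper actually partitions the work is not stated in the abstract.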
Abstract: This paper discusses a new approach to multiple object tracking relative to background information. The concept of multiple object tracking through background learning is based upon the theory of relativity, which involves a frame of reference in the spatial domain to localize and/or track any object. The field of multiple object tracking has seen a lot of research, but researchers have considered the background redundant. However, in object tracking, the background plays a vital role and leads to a definite improvement in the overall tracking process. In the present work, an algorithm is proposed for multiple object tracking through background learning. The learning framework is based on a graph embedding approach for localizing multiple objects. The graph utilizes the inherent capabilities of depth modelling, which help avoid occlusion among multiple objects prior to tracking. The proposed algorithm has been compared with recent work available in the literature on numerous performance evaluation measures. It is observed that the proposed algorithm gives better performance.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1F1A1068828).
Abstract: Object tracking, an important technology in the field of image processing and computer vision, is used to continuously track a specific object or person in an image. This technology may be effective in identifying the same person within one image, but it has limitations in handling multiple images owing to the difficulty in identifying whether the object appearing in other images is the same. When tracking the same object using two or more images, there must be a way to determine that objects existing in different images are the same object. Therefore, this paper attempts to determine the same object present in different images using color information among the unique information of the object. Thus, this study proposes a multiple-object-tracking method using histogram stamp extraction in closed-circuit television applications. The proposed method determines the presence or absence of a target object in an image by comparing the similarity between the image containing the target object and other images. To this end, a unique color value of the target object is extracted based on its color distribution in the image using three methods: mean, mode, and interquartile range. The Top-N accuracy method is used to analyze the accuracy of each method, and the results show that the mean method had an accuracy of 93.5% (Top-2). Furthermore, the positive prediction value experimental results show that the accuracy of the mean method was 65.7%. The analysis shows that it is possible to detect and track the same object present in different images using the unique color of the object. These results make it possible to track the same object across different images with minimal manpower and without using personal information. Finally, the response speed experiment showed that, when the mean was used, the color of the object could be extracted in real time, in 0.016954 s. The proposed method therefore makes it possible to detect and track the same object in real time.
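The three color-extraction variants named in the abstract above (mean, mode, interquartile range) can be sketched per channel as follows. This is a minimal illustration, not the paper's procedure: `color_stamp` and `channel_value` are hypothetical names, and the "iqr" variant is assumed here to mean the average of the values lying between the first and third quartiles, which the abstract does not specify.

```python
import statistics


def channel_value(values, method):
    """Reduce one color channel of an object's pixels to a single value."""
    if method == "mean":
        return statistics.mean(values)
    if method == "mode":
        return statistics.mode(values)
    if method == "iqr":
        # Assumption: average only the values inside [Q1, Q3] to suppress outliers.
        q1, _, q3 = statistics.quantiles(values, n=4)
        kept = [v for v in values if q1 <= v <= q3]
        return statistics.mean(kept if kept else values)
    raise ValueError(f"unknown method: {method}")


def color_stamp(pixels, method="mean"):
    """Return one representative value per channel for a list of (R, G, B) pixels."""
    channels = zip(*pixels)
    return tuple(channel_value(list(channel), method) for channel in channels)


if __name__ == "__main__":
    # Hypothetical object pixels with one bright outlier in the red channel.
    pixels = [(10, 0, 5), (12, 0, 5), (12, 0, 5), (13, 0, 5),
              (14, 0, 5), (15, 0, 5), (200, 0, 5)]
    for method in ("mean", "mode", "iqr"):
        print(method, color_stamp(pixels, method))
```

With an outlier present, the mean is pulled toward it while the mode and the quartile-trimmed value are not, which is one reason a study might compare the three.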
Abstract: This research addresses the challenge of integrating and cleaning data, which is a crucial task in object matching. While the object is detected and then measured, vibration at different light intensities may influence the durability and reliability of mechanical systems or structures and cause problems such as damage, abnormal stopping, and disaster. Recent research has failed to improve the accuracy rate and the computation time in tracking an object and in vibration measurement. To solve these problems, the proposed research simplifies the scaling factor determination by assigning a known real-world dimension to a predetermined portion of the image. A novel white sticker of known dimensions, marked with a color dot, is pasted on the surface of an object for the best result in template matching using the Improved Up-Sampled Cross-Correlation (UCC) algorithm. The vibration measurement is calculated using the Finite-Difference Algorithm (FDA) on a machine vision system fitted with a macro lens sensor that can capture images at close range without affecting the quality of the displacement measurement from the video frames. The field test was conducted on TAFE (Tractors and Farm Equipment Limited) tractor parts. The error was between 30% and 50% at very low vibration values close to zero, whereas it was between 5% and 10% at most high accelerations, the essential range for vibration analysis. The suggested system is therefore more suitable for measuring the vibration of stationary machinery with low frequency ranges. The macro lens enables image frames to be captured at very close range. Because an error of 30% to 50% was observed when the vibration amplitude is very small, this approach is not suitable for nano-vibration analysis.
Funding: Funded by the Basic Science Major Foundation (Natural Science) of the Jiangsu Higher Education Institutions of China (Grant: 22KJA520012), the Xuzhou Science and Technology Plan Project (Grant: KC21303, KC22305), and the sixth "333 project" of Jiangsu Province.
Abstract: Aiming at the problem that a single correlation filter model is sensitive to complex scenes such as background interference and occlusion, a tracking algorithm based on multi-time-space perception and instance-specific proposals is proposed to optimize the mathematical model of the correlation filter (CF). Firstly, according to the consistency of the changes between the object frames and the filter frames, a mask matrix is introduced into the objective function of the filter, so as to extract the spatio-temporal information of the object with background awareness. Secondly, the objective function of multi-feature fusion is constructed for object localization, which is optimized by the Lagrange method and solved by closed-form iteration. In the process of filter optimization, a time-space perception constraint term is designed to enhance the learning ability of the CF and optimize the final tracking results. Finally, when the tracking results fluctuate, a boundary suppression factor is introduced into the instance-specific proposals to effectively reduce the risk of model drift. The accuracy and success rate of the proposed algorithm are verified by simulation analysis on two popular benchmarks, the object tracking benchmark 2015 (OTB2015) and the temple color 128 (TC-128). Extensive experimental results illustrate that the optimized appearance model of the proposed algorithm is effective. The distance precision rate and overlap success rate of the proposed algorithm are 0.756 and 0.656 on the OTB2015 benchmark, which are better than the results of other competing algorithms. The results of this study can help solve the problem of real-time object tracking in real traffic environments and provide a specific reference for the detection of traffic abnormalities.
Abstract: The field of object tracking has recently made significant progress. In particular, both deep learning-based and correlation filter-based trackers have achieved effective tracking performance. However, there are still some difficulties with object tracking, for example illumination variation and deformation (DEF). The precision and accuracy of tracking algorithms suffer from the effects of such occurrences, so finding a solution is important. This research proposes a new tracking algorithm to handle this problem. The features are extracted using a Modified LeNet-5, and the precision and accuracy are improved by developing the Real-Time Cross-modality Correlation Filtering method (RCCF). In the Modified LeNet-5, the visual tracking performance is improved by adjusting the number and size of the convolution kernels in the pooling and convolution layers. The high-level, middle-level, and handcrafted features are extracted from the modified LeNet-5 network. The handcrafted features are used to determine the specific location of the target because they contain more spatial information regarding the visual object. The LeNet features are more suitable for handling target appearance changes in object tracking. Extensive experiments were conducted on the Object Tracking Benchmark (OTB) datasets OTB50 and OTB100. The experimental results reveal that the proposed tracker outperforms other state-of-the-art trackers under different problems. The experimental simulation is carried out in Python. The overall success rate and precision of the proposed algorithm are 93.8% and 92.5%, respectively. The average running frame rate reaches 42 frames per second, which can meet real-time requirements.
Abstract: With the advent of real-time applications such as autonomous driving, visual surveillance, and sports analysis, there is increasing attention on Multiple-Object Tracking (MOT). The tracking-by-detection paradigm, a commonly utilized approach, connects the existing recognition hypotheses to the formerly assessed object trajectories by comparing the similarities of appearance or motion between them. For efficient detection and tracking of numerous objects in a complex environment, a Pearson Similarity-centred Kuhn-Munkres (PS-KM) algorithm is proposed in the present study. The input videos were first gathered from the MOT dataset and converted into frames. After the frame conversion stage, background subtraction filtered out the inappropriate data in the frames. Then, features were extracted from the frames. Afterwards, the higher-dimensional features were transformed into lower-dimensional features, with feature reduction performed by Information Gain-centred Singular Value Decomposition (IG-SVD). Next, classification was executed using the Modified Recurrent Neural Network (MRNN) method, which additionally identified the categories of the objects. The PS-KM algorithm then tracked the recognized objects. Finally, the experimental outcomes showed that numerous targets were precisely tracked by the proposed system with 97% accuracy and a low false positive rate (FPR) of 2.3%. It was also shown that the present techniques, viz. RNN, CNN, and KNN, were effective with regard to the existing models.
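The association step in tracking-by-detection, as described in the abstract above, pairs each existing track with a detection so that the total similarity is maximal. The sketch below scores pairs with the Pearson correlation of their feature vectors and, to stay short and dependency-free, finds the best pairing by brute force over permutations rather than with the Kuhn-Munkres algorithm itself; for small inputs the result is the same optimal assignment. All names and the sample feature vectors are illustrative assumptions, not the paper's PS-KM implementation.

```python
import math
from itertools import permutations


def pearson(x, y):
    """Pearson correlation of two equal-length feature vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_assignment(track_features, detection_features):
    """Return a list a where track i is matched to detection a[i].

    Assumes the same number of tracks and detections. Brute force is O(n!);
    a Kuhn-Munkres implementation solves the same problem in polynomial time.
    """
    n = len(track_features)
    similarity = [[pearson(t, d) for d in detection_features] for t in track_features]
    best_perm, best_score = None, -math.inf
    for perm in permutations(range(n)):
        score = sum(similarity[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm)


if __name__ == "__main__":
    tracks = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
    detections = [[6.0, 4.0, 2.0], [2.0, 4.0, 6.5]]
    print(best_assignment(tracks, detections))
```

For more than a handful of objects, the brute-force search would need to be replaced by a proper Kuhn-Munkres solver.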
Abstract: The number of control messages needed in wireless sensor networks (WSN) is affected by the storage strategy for detected events. Because broadcasting superfluous control messages consumes excess energy, the network lifespan can be extended if the quantity of control messages is decreased. In this study, an optimized storage technique with low control overhead for tracking objects in WSN is introduced. The basic concept is to retain observed events in internal memory and preserve the relationship between sensed information and sensor nodes using a novel inexpensive data structure called the Ordered Binary Linked List (OBLL). Whenever an object passes over the sensor area, the recognizing sensor can immediately produce an OBLL along the object's route. To retrieve the entire information, the OBLL can be traversed with logarithmic complexity, which is much lower than the traversal complexity of existing linked list structures. Performance evaluation and simulations were carried out to confirm that the suggested technique minimizes the number of messages, thereby saving energy and extending the network life.
Abstract: An improved estimation of the motion vectors of feature points is proposed for tracking moving objects in dynamic image sequences. Feature points are first extracted by the improved minimum intensity change (MIC) algorithm. The matching points of these feature points are then determined by adaptive rood pattern searching. Based on the random sample consensus (RANSAC) method, the background motion is finally compensated using the parameters of an affine transform of the background motion. With reasonable morphological filtering, the moving objects are completely extracted from the background and then tracked accurately. Experimental results show that the improved method succeeds in compensating the background motion and offers great promise for tracking moving objects in dynamic image sequences.
Funding: Supported in part by the Institute for Guo Qiang of Tsinghua University (2019GQG1023), in part by the Graduate Education and Teaching Reform Project of Tsinghua University (202007J007), in part by the National Natural Science Foundation of China (U19B2029, 62073028, 61803222), and in part by the Independent Research Program of Tsinghua University (2018Z05JDX002).
Abstract: There are two main trends in the development of unmanned aerial vehicle (UAV) technologies: miniaturization and intellectualization, in which realizing object tracking capabilities for a nano-scale UAV is one of the most challenging problems. In this paper, we present a visual object tracking and servoing control system utilizing a tailor-made 38 g nano-scale quadrotor. A lightweight visual module is integrated to enable object tracking capabilities, and a micro positioning deck is mounted to provide accurate pose estimation. In order to be robust against object appearance variations, a novel object tracking algorithm, denoted by RMCTer, is proposed, which integrates a powerful short-term tracking module and an efficient long-term processing module. In particular, the long-term processing module can provide additional object information and modify the short-term tracking model in a timely manner. Furthermore, a position-based visual servoing control method is proposed for the quadrotor, where an adaptive tracking controller is designed by leveraging backstepping and adaptive techniques. Stable and accurate object tracking is achieved even under disturbances. Experimental results are presented to demonstrate the high accuracy and stability of the whole tracking system.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51009040), the Heilongjiang Postdoctoral Fund (Grant No. LBH-Z11205), the National High Technology Research and Development Program of China (863 Program, Grant No. 2011AA09A106), and the China Postdoctoral Science Foundation (Grant No. 2012M510928).
Abstract: This paper describes a new framework for object detection and tracking for an AUV, including underwater acoustic data interpolation, underwater acoustic image segmentation, and underwater object tracking. This framework is applied to the design of a vision-based method for an AUV based on the forward-looking sonar sensor. First, the real-time data flow (underwater acoustic images) is pre-processed to form the whole underwater acoustic image, and the relevant position information of objects is extracted and determined. An improved double threshold segmentation method is proposed to resolve the problem that the threshold cannot be adjusted adaptively in the traditional method. Second, a representation of region information is created in light of the Gaussian particle filter. A weighted integration strategy combining the area and invariant moments is proposed to refine the weights of particles and enhance tracking robustness. Results obtained on the real acoustic vision platform of the AUV during sea trials are presented and discussed. They show that the proposed method can detect and track moving underwater objects online, and that it is effective and robust.
Funding: The Framework of International Cooperation Program managed by the National Research Foundation of Korea (2019K1A3A1A8011295711).
Abstract: Collaborative robotics is one of the high-interest research topics in academia and industry. It has been progressively utilized in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve in several object detection and tracking tasks. These tasks have been considered challenging, high-level perceptual problems, frequently dominated by relative information about the environment, where concerns such as occlusion, illumination, background, object deformation, and object class variations are commonplace. In order to show the importance of top view surveillance, a collaborative robotics framework is presented. It can assist in the detection and tracking of multiple objects in top view surveillance. The framework consists of a smart robotic camera embedded with a visual processing unit. The existing pre-trained deep learning models SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING. These algorithms, along with the detection models, help to track and predict the trajectories of detected objects. Because pre-trained models are employed, the generalization performance is also investigated by testing the models on various sequences of a top view data set. The detection models achieved a maximum True Detection Rate of 90% to 93% with a maximum False Detection Rate of 0.6%. The tracking results of the different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, the output results are discussed along with future guidelines.
Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2006CB300407) and the National Natural Science Foundation of China (No. 50775017).
Abstract: Inspired by human behaviors, a robot object tracking model is proposed on the basis of a visual attention mechanism, which fits the theory of topological perception. The model integrates image-driven, bottom-up attention and object-driven, top-down attention, whereas previous attention models have mostly focused on either bottom-up or top-down attention. By the bottom-up component, the whole scene is segmented into the ground region and the salient regions. Guided by the top-down strategy, which is realized by a topological graph, the object regions are separated from the salient regions. The salient regions other than the object regions are the barrier regions. In order to evaluate the model, a mobile robot platform is developed, on which experiments are carried out. The experimental results indicate that processing an image with a resolution of 752 × 480 pixels takes less than 200 ms and that the object regions are intact. A comparison of the proposed model with the existing model demonstrates that the proposed model has advantages in robot object tracking in terms of speed and efficiency.
Abstract: A method for moving object recognition and tracking in an intelligent traffic monitoring system is presented. To address the shortcomings of the frame-subtraction method, a redundant discrete wavelet transform (RDWT) based moving object recognition algorithm is put forward, which directly detects moving objects in the redundant discrete wavelet transform domain. An improved adaptive mean-shift algorithm is used to track the moving object in the follow-up frames. Experimental results show that the algorithm can effectively extract the moving object, even when the object is similar to the background, and that the results are better than those of the traditional frame-subtraction method. The object tracking is accurate and unaffected by changes in the size of the object. Therefore, the algorithm has practical value and prospects.
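For context, the traditional frame-subtraction baseline that the abstract above compares against can be sketched in a few lines of Python. This shows only that baseline, not the RDWT-based method; the grayscale frames are plain nested lists, and the function names and threshold value are assumptions for illustration.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold):
    """Mark pixels whose absolute intensity change exceeds the threshold."""
    return [
        [1 if abs(curr - prev) > threshold else 0 for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the changed pixels, or None if nothing moved."""
    coords = [(r, c) for r, row in enumerate(mask) for c, value in enumerate(row) if value]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    previous = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    current = [[0, 0, 0, 0], [0, 255, 255, 0], [0, 0, 0, 0]]
    mask = frame_difference_mask(previous, current, threshold=30)
    print(mask)
    print(bounding_box(mask))
```

Because the mask depends only on intensity change, an object whose gray level is close to the background produces little or no response, which is the weakness the abstract attributes to frame subtraction.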
Funding: Supported by the National Natural Science Foundation of China (60835004, 60775047, 60872130) and the National High Technology Research and Development Program of China (863 Program) (2007AA04Z244, 2008AA04Z214).
Abstract: An object model-based tracking method is useful for tracking multiple objects, but the main difficulties are modeling objects reliably and tracking objects via models in successive frames. An effective tracking method using object models is proposed to track multiple objects in a real-time visual surveillance system. Firstly, for detecting objects, an adaptive kernel density estimation method is utilized, which uses an adaptive bandwidth and features combining colour and gradient. Secondly, models of objects are built to describe motion, shape, and colour features. Then, a matching matrix is formed to analyze tracking situations. If objects are tracked under occlusion, the optimal "visual" object is found to represent the occluded object, and the posterior probability of each pixel is used to determine which pixels are utilized for updating the object models. Extensive experiments show that this method improves the accuracy and validity of tracking objects even under occlusion and can be used in real-time visual surveillance systems.
Funding: Supported by the Zhejiang Key Laboratory of General Aviation Operation Technology (No. JDGA2020-7), the National Natural Science Foundation of China (No. 62173237), the Natural Science Foundation of Liaoning Province (No. 2019-MS-251), the Talent Project of Revitalization Liaoning Province (No. XLYC1907022), the Key R&D Projects of Liaoning Province (No. 2020JH2/10100045), and the High-Level Innovation Talent Project of Shenyang (No. RC190030).
Abstract: Single object tracking based on deep learning has achieved advanced performance in many computer vision applications. However, existing trackers have certain limitations owing to deformation, occlusion, movement, and other conditions. We propose a Siamese attentional dense network called SiamADN, trained in an end-to-end offline manner and aimed especially at unmanned aerial vehicle (UAV) tracking. First, it applies a dense network to reduce vanishing gradients, which strengthens feature transfer. Second, a channel attention mechanism is incorporated into the DenseNet structure in order to focus on the possible key regions. An advanced corner detection network is introduced to improve the subsequent tracking process. Extensive experiments are carried out on four main tracking benchmarks: OTB-2015, UAV123, LaSOT, and VOT. The accuracy rate on UAV123 is 78.9%, and the running speed is 32 frames per second (FPS), which demonstrates its efficiency in practical applications.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51874300), the National Natural Science Foundation of China and Shanxi Provincial People's Government Jointly Funded Project of China for Coal Base and Low Carbon (Grant No. U1510115), the National Natural Science Foundation of China (51104157), the Qing Lan Project, the China Postdoctoral Science Foundation (Grant No. 2013T60574), and the Scientific Instrument Developing Project of the Chinese Academy of Sciences (Grant No. YJKYYQ20170074).
Abstract: Video object tracking is an important research topic in computer vision, which finds a wide range of applications in video surveillance, robotics, human-computer interaction, and so on. Although many moving object tracking algorithms have been proposed, there are still many difficulties in the actual tracking process, such as illumination change, occlusion, motion blurring, scale change, and self-change. Therefore, the development of object tracking technology is still challenging. The emergence of deep learning theory and methods provides a new opportunity for research on object tracking, and it is also the main theoretical framework for the research on moving object tracking algorithms in this paper. In this paper, the existing deep learning-based target tracking algorithms are classified and organized. Based on previous knowledge and the author's own understanding, several solutions are proposed for the existing methods. In addition, existing deep learning target tracking methods still struggle to meet real-time requirements, and how to design the network and the tracking process to improve both speed and effectiveness leaves considerable room for research.
Funding: Supported by the National Natural Science Foundation of China (No. 60672094).
Abstract: If a fairly fast moving object exists in a complicated tracking environment, the snake's nodes may fall into inaccurate local minima. We propose a mean shift snake algorithm to solve this problem. However, if the object goes beyond the limits of the mean shift snake module's operation in successive sequences, the mean shift snake's nodes may also fall into local minima while moving to the new object position. This paper presents a motion compensation strategy using a particle filter; accordingly, a new Particle Filter Mean Shift Snake (PFMSS) algorithm is proposed, which combines the particle filter with the mean shift snake to estimate the contour of the fast moving object. Firstly, the fast moving object is tracked by the particle filter to create a coarse position, which is used to initialize the mean shift algorithm. Secondly, the whole relevant motion information is used to compensate the snake's node positions. Finally, the snake algorithm is used to extract the exact object contour, and the useful information about the object is fed back. Several real-world sequences are tested, and the results show that the novel tracking method performs well, with high accuracy, in solving fast-motion problems against cluttered backgrounds.