"视觉词袋"(Bag of Visual Words,BOV)算法是一种有效的基于语义特征表达的物体识别算法。针对传统BOV模型存在的不足,综合利用SAR图像的灰度和纹理特征,提出基于感兴趣目标(Target of Interest,TOI)的"视觉词袋"..."视觉词袋"(Bag of Visual Words,BOV)算法是一种有效的基于语义特征表达的物体识别算法。针对传统BOV模型存在的不足,综合利用SAR图像的灰度和纹理特征,提出基于感兴趣目标(Target of Interest,TOI)的"视觉词袋"算法。首先,对训练图像进行TOI选取,用灰度共生矩阵模型提取TOI的纹理特征,再结合灰度特征,组成多维特征向量集,以簇内相似度最高、数据分布密度最大为准则,生成"视觉词袋"。其次,对测试图像,依据已生成的"视觉词袋",采用支持向量机(Support Vector Machine,SVM)分类器,实现SAR图像感兴趣目标的有效分类。实验结果表明,与传统的"视觉词袋"构建算法相比,该算法在分类正确率提高的同时,能够在训练图像较少的情况下达到良好的分类效果。展开更多
In this paper, the Kalman filter is used to predict image feature positionaround which an image-processing window is then established to diminish feature-searching area andto heighten the image-processing speed. Accor...In this paper, the Kalman filter is used to predict image feature positionaround which an image-processing window is then established to diminish feature-searching area andto heighten the image-processing speed. According to the fundamentals of image-based visual servoing(IBVS), the cerebellar model articulation controller (CMAC) neural network is inserted into thevisual servo control loop to implement the nonlinear mapping from the error signal in the imagespace to the control signal in the input space instead of the iterative adjustment and complicatedinverse solution of the image Jacobian. Simulation results show that the feature point can bepredicted efficiently using the Kalman filter and on-line supervised learning can be realized usingCMAC neural network; end-effector can track the target object very well.展开更多
Visualization methods for single documents are either too simple, considering word frequency only, or depend on syntactic and semantic information bases to be more useful. This paper presents an intermediary approach,...Visualization methods for single documents are either too simple, considering word frequency only, or depend on syntactic and semantic information bases to be more useful. This paper presents an intermediary approach, based on H. P. Luhn’s automatic abstract creation algorithm, and intends to aggregate more information to document visualization than word counting methods do without the need of external sources. The method takes pairs of relevant words and computes the linkage force between them. Relevant words become vertices and links become edges in the resulting graph.展开更多
In recent visual tracking research,correlation filter(CF)based trackers become popular because of their high speed and considerable accuracy.Previous methods mainly work on the extension of features and the solution o...In recent visual tracking research,correlation filter(CF)based trackers become popular because of their high speed and considerable accuracy.Previous methods mainly work on the extension of features and the solution of the boundary effect to learn a better correlation filter.However,the related studies are insufficient.By exploring the potential of trackers in these two aspects,a novel adaptive padding correlation filter(APCF)with feature group fusion is proposed for robust visual tracking in this paper based on the popular context-aware tracking framework.In the tracker,three feature groups are fused by use of the weighted sum of the normalized response maps,to alleviate the risk of drift caused by the extreme change of single feature.Moreover,to improve the adaptive ability of padding for the filter training of different object shapes,the best padding is selected from the preset pool according to tracking precision over the whole video,where tracking precision is predicted according to the prediction model trained by use of the sequence features of the first several frames.The sequence features include three traditional features and eight newly constructed features.Extensive experiments demonstrate that the proposed tracker is superior to most state-of-the-art correlation filter based trackers and has a stable improvement compared to the basic trackers.展开更多
The generic Meanshift is susceptible to interference of background pixels with the target pixels in the kernel of the reference model, which compromises the tracking performance. In this paper, we enhance the target c...The generic Meanshift is susceptible to interference of background pixels with the target pixels in the kernel of the reference model, which compromises the tracking performance. In this paper, we enhance the target color feature by attenuating the background color within the kernel through enlarging the pixel weightings which map to the pixels on the target. This way, the background pixel interference is largely suppressed in the color histogram in the course of constructing the target reference model. In addition, the proposed method also reduces the number of Meanshift iterations, which speeds up the algorithmic convergence. The two tests validate the proposed approach with improved tracking robustness on real-world video sequences.展开更多
Most sensors or cameras discussed in the sensor network community are usually 3D homogeneous, even though their2 D coverage areas in the ground plane are heterogeneous. Meanwhile, observed objects of camera networks a...Most sensors or cameras discussed in the sensor network community are usually 3D homogeneous, even though their2 D coverage areas in the ground plane are heterogeneous. Meanwhile, observed objects of camera networks are usually simplified as 2D points in previous literature. However in actual application scenes, not only cameras are always heterogeneous with different height and action radiuses, but also the observed objects are with 3D features(i.e., height). This paper presents a sensor planning formulation addressing the efficiency enhancement of visual tracking in 3D heterogeneous camera networks that track and detect people traversing a region. The problem of sensor planning consists of three issues:(i) how to model the 3D heterogeneous cameras;(ii) how to rank the visibility, which ensures that the object of interest is visible in a camera's field of view;(iii) how to reconfigure the 3D viewing orientations of the cameras. This paper studies the geometric properties of 3D heterogeneous camera networks and addresses an evaluation formulation to rank the visibility of observed objects. Then a sensor planning method is proposed to improve the efficiency of visual tracking. Finally, the numerical results show that the proposed method can improve the tracking performance of the system compared to the conventional strategies.展开更多
According to the main tools of TRIZ, the theory of inventive problem solving, a new flowchart of the product conceptual design process to solve contradiction in TRIZ is proposed. In order to realize autonomous moving ...According to the main tools of TRIZ, the theory of inventive problem solving, a new flowchart of the product conceptual design process to solve contradiction in TRIZ is proposed. In order to realize autonomous moving and automatic weld seam tracking for welding robot in Tailed Welded Blanks, a creative design of robotic visual tracking system bused on CMOS has been developed by using the flowchart. The new system is not only used to inspect the workpiece ahead of a welding torch and measure the joint orientation and lateral deviation caused by curvature or discontinuity in the joint part, but also to record and measure the image size of the weld pool. Moreover, the hardware and software components are discussed in brief.展开更多
To improve the reliability and accuracy of visual tracker,a robust visual tracking algorithm based on multi-cues fusion under Bayesian framework is proposed.The weighed color and texture cues of the object are applied...To improve the reliability and accuracy of visual tracker,a robust visual tracking algorithm based on multi-cues fusion under Bayesian framework is proposed.The weighed color and texture cues of the object are applied to describe the moving object.An adjustable observation model is incorporated into particle filtering,which utilizes the properties of particle filter for coping with non-linear,non-Gaussian assumption and the ability to predict the position of the moving object in a cluttered environment and two complementary attributes are employed to estimate the matching similarity dynamically in term of the likelihood ratio factors;furthermore tunes the weight values according to the confidence map of the color and texture feature on-line adaptively to reconfigure the optimal observation likelihood model,which ensured attaining the maximum likelihood ratio in the tracking scenario even if in the situations where the object is occluded or illumination,pose and scale are time-variant.The experimental result shows that the algorithm can track a moving object accurately while the reliability of tracking in a challenging case is validated in the experimentation.展开更多
This paper introduces an approach for visual tracking of multi-target with occlusion occurrence. Based on the author's previous work in which the Overlap Coefficient (OC) is used to detect the occlusion, in this p...This paper introduces an approach for visual tracking of multi-target with occlusion occurrence. Based on the author's previous work in which the Overlap Coefficient (OC) is used to detect the occlusion, in this paper a method of combining Bhattacharyya Coefficient (BC) and Kalman filter innovation term is proposed as the criteria for jointly detecting the occlusion occurrence. Fragmentation of target is introduced in order to closely monitor the occlusion development. In the course of occlusion, the Kalman predictor is applied to determine the location of the occluded target, and the criterion for checking the re-appearance of the occluded target is also presented. The proposed approach is put to test on a standard video sequence, suggesting the satisfactory performance in multi-target tracking.展开更多
To tackle the problem of severe occlusions in visual tracking, we propose a hierarchical template-matching method based on a layered appearance model. This model integrates holistic- and part-region matching in order ...To tackle the problem of severe occlusions in visual tracking, we propose a hierarchical template-matching method based on a layered appearance model. This model integrates holistic- and part-region matching in order to locate an object in a coarse-to-fine manner. Furthermore, in order to reduce ambiguity in object localization, only the discriminative parts of an object' s appearance template are chosen for similarity computing with respect to their cornerness measurements. The similarity between parts is computed in a layer-wise manner, and from this, occlusions can be evaluated. When the object is partly occluded, it can be located accurately by matching candidate regions with the appearance template. When it is completely occluded, its location can be predicted from its historical motion information using a Kalman filter. The proposed tracker is tested on several practical image sequences, and the experimental results show that it can consistently provide accurate object location for stable tracking, even for severe occlusions.展开更多
The colour feature is often used in the object tracking.The tracking methods extract the colour features of the object and the background,and distinguish them by a classifier.However,these existing methods simply use ...The colour feature is often used in the object tracking.The tracking methods extract the colour features of the object and the background,and distinguish them by a classifier.However,these existing methods simply use the colour information of the target pixels and do not consider the shape feature of the target,so that the description capability of the feature is weak.Moreover,incorporating shape information often leads to large feature dimension,which is not conducive to real-time object tracking.Recently,the emergence of visual tracking methods based on deep learning has also greatly increased the demand for computing resources of the algorithm.In this paper,we propose a real-time visual tracking method with compact shape and colour feature,which forms low dimensional compact shape and colour feature by fusing the shape and colour characteristics of the candidate object region,and reduces the dimensionality of the combined feature through the Hash function.The structural classification function is trained and updated online with dynamic data flow for adapting to the new frames.Further,the classification and prediction of the object are carried out with structured classification function.The experimental results demonstrate that the proposed tracker performs superiorly against several state-of-the-art algorithms on the challenging benchmark dataset OTB-100 and OTB-13.展开更多
Visual tracking is a classical computer vision problem with many applications.Efficient convolution operators(ECO)is one of the most outstanding visual tracking algorithms in recent years,it has shown great performanc...Visual tracking is a classical computer vision problem with many applications.Efficient convolution operators(ECO)is one of the most outstanding visual tracking algorithms in recent years,it has shown great performance using discriminative correlation filter(DCF)together with HOG,color maps and VGGNet features.Inspired by new deep learning models,this paper propose a hybrid efficient convolution operators integrating fully convolution network(FCN)and residual network(ResNet)for visual tracking,where FCN and ResNet are introduced in our proposed method to segment the objects from backgrounds and extract hierarchical feature maps of objects,respectively.Compared with the traditional VGGNet,our approach has higher accuracy for dealing with the issues of segmentation and image size.The experiments show that our approach would obtain better performance than ECO in terms of precision plot and success rate plot on OTB-2013 and UAV123 datasets.展开更多
This paper addresses the robust visual tracking of multi-feature points for a 3D manipulator with unknown intrinsic and extrinsic parameters of the vision system. This class of control systems are highly nonlinear con...This paper addresses the robust visual tracking of multi-feature points for a 3D manipulator with unknown intrinsic and extrinsic parameters of the vision system. This class of control systems are highly nonlinear control systems characterized as time-varying and strong coupling in states and unknown parameters. It is first pointed out that not only is the Jacobian image matrix nonsingular, but also its minimum singular value has a positive limit. This provides the foundation of kinematics and dynamics control of manipulators with visual feedback. Second, the Euler angle expressed rotation transformation is employed to estimate a subspace of the parameter space of the vision system. Based on the two results above, and arbitrarily chosen parameters in this subspace, the tracking controllers are proposed so that the image errors can be made as small as desired so long as the control gain is allowed to be large. The controller does not use visual velocity to achieve high and robust performance with low sampling rate of the vision system. The obtained results are proved by Lyapunov direct method. Experiments are included to demonstrate the effectiveness of the proposed controller.展开更多
To solve the problem of low robustness of trackers under significant appearance changes in complex background,a novel moving target tracking method based on hierarchical deep features weighted fusion and correlation f...To solve the problem of low robustness of trackers under significant appearance changes in complex background,a novel moving target tracking method based on hierarchical deep features weighted fusion and correlation filter is proposed.Firstly,multi-layer features are extracted by a deep model pre-trained on massive object recognition datasets.The linearly separable features of Relu3-1,Relu4-1 and Relu5-4 layers from VGG-Net-19 are especially suitable for target tracking.Then,correlation filters over hierarchical convolutional features are learned to generate their correlation response maps.Finally,a novel approach of weight adjustment is presented to fuse response maps.The maximum value of the final response map is just the location of the target.Extensive experiments on the object tracking benchmark datasets demonstrate the high robustness and recognition precision compared with several state-of-the-art trackers under the different conditions.展开更多
Visual tracking has been widely applied in construction industry and attracted signifi-cant interests recently. Lots of research studies have adopted visual tracking techniques on the surveillance of construction work...Visual tracking has been widely applied in construction industry and attracted signifi-cant interests recently. Lots of research studies have adopted visual tracking techniques on the surveillance of construction workforce, project productivity and construction safety. Until now, visual tracking algorithms have gained promising performance when tracking un-articulated equipment in construction sites. However, state-of-art tracking algorithms have unguaranteed performance in tracking articulated equipment, such as backhoes and excavators. The stretching buckets and booms are the main obstacles of successfully tracking articulated equipment. In order to fill this knowledge gap, the part-based tracking algorithms are introduced in this paper for tracking articulated equipment in construction sites. The part-based tracking is able to track different parts of target equipment while using multiple tracking algorithms at the same sequence. Some existing tracking methods have been chosen according to their outstanding performance in the computer vision community. Then, the part-based algorithms were created on the basis of selected visual tracking methods and tested by real construction sequences. In this way, the tracking performance was evaluated from effectiveness and robustness aspects. Throughout the quantification analysis, the tracking performance of articulated equipment was much more improved by using the part-based tracking algorithms.展开更多
The 3D object visual tracking problem is studied for the robot vision system of the 220kV/330kV high-voltage live-line insulator cleaning robot. The SUSAN Edge based Scale Invariant Feature (SESIF) algorithm based 3D ...The 3D object visual tracking problem is studied for the robot vision system of the 220kV/330kV high-voltage live-line insulator cleaning robot. The SUSAN Edge based Scale Invariant Feature (SESIF) algorithm based 3D objects visual tracking is achieved in three stages: the first frame stage,tracking stage,and recovering stage. An SESIF based objects recognition algorithm is proposed to find initial location at both the first frame stage and recovering stage. An SESIF and Lie group based visual tracking algorithm is used to track 3D object. Experiments verify the algorithm's robustness. This algorithm will be used in the second generation of the 220kV/330kV high-voltage live-line insulator cleaning robot.展开更多
提出了一种利用"bag of words"模型对视频内容进行建模和匹配的方法。通过量化视频帧的局部特征构建视觉关键词(visual words)辞典,将视频的子镜头表示成若干视觉关键词的集合。在此基础上构建基于子镜头的视觉关键词词组的...提出了一种利用"bag of words"模型对视频内容进行建模和匹配的方法。通过量化视频帧的局部特征构建视觉关键词(visual words)辞典,将视频的子镜头表示成若干视觉关键词的集合。在此基础上构建基于子镜头的视觉关键词词组的倒排索引,用于视频片段的匹配和检索。这种方法保留了局部特征的显著性及其相对位置关系,而且有效地压缩了视频的表达,加速的视频的匹配和检索过程。实验结果表明,和已有方法相比,基于"bag of words"的视频匹配方法在大视频样本库上获得了更高的检索精度和检索速度。展开更多
Bag of Words算法是一种有效的基于语义特征提取与表达的物体识别算法,算法充分学习文本检索算法的优点,将图片整理为一系列视觉词汇的集合,提取物体的语义特征,实现感兴趣物体的有效检测与识别。文章主要研究了Bagof Words算法的框架...Bag of Words算法是一种有效的基于语义特征提取与表达的物体识别算法,算法充分学习文本检索算法的优点,将图片整理为一系列视觉词汇的集合,提取物体的语义特征,实现感兴趣物体的有效检测与识别。文章主要研究了Bagof Words算法的框架和基本内容。展开更多
文摘"视觉词袋"(Bag of Visual Words,BOV)算法是一种有效的基于语义特征表达的物体识别算法。针对传统BOV模型存在的不足,综合利用SAR图像的灰度和纹理特征,提出基于感兴趣目标(Target of Interest,TOI)的"视觉词袋"算法。首先,对训练图像进行TOI选取,用灰度共生矩阵模型提取TOI的纹理特征,再结合灰度特征,组成多维特征向量集,以簇内相似度最高、数据分布密度最大为准则,生成"视觉词袋"。其次,对测试图像,依据已生成的"视觉词袋",采用支持向量机(Support Vector Machine,SVM)分类器,实现SAR图像感兴趣目标的有效分类。实验结果表明,与传统的"视觉词袋"构建算法相比,该算法在分类正确率提高的同时,能够在训练图像较少的情况下达到良好的分类效果。
基金The National Natural Science Foundation of China (59990470).
文摘In this paper, the Kalman filter is used to predict image feature positionaround which an image-processing window is then established to diminish feature-searching area andto heighten the image-processing speed. According to the fundamentals of image-based visual servoing(IBVS), the cerebellar model articulation controller (CMAC) neural network is inserted into thevisual servo control loop to implement the nonlinear mapping from the error signal in the imagespace to the control signal in the input space instead of the iterative adjustment and complicatedinverse solution of the image Jacobian. Simulation results show that the feature point can bepredicted efficiently using the Kalman filter and on-line supervised learning can be realized usingCMAC neural network; end-effector can track the target object very well.
文摘Visualization methods for single documents are either too simple, considering word frequency only, or depend on syntactic and semantic information bases to be more useful. This paper presents an intermediary approach, based on H. P. Luhn’s automatic abstract creation algorithm, and intends to aggregate more information to document visualization than word counting methods do without the need of external sources. The method takes pairs of relevant words and computes the linkage force between them. Relevant words become vertices and links become edges in the resulting graph.
基金supported by the National KeyResearch and Development Program of China(2018AAA0103203)the National Natural Science Foundation of China(62073036,62076031)the Beijing Natural Science Foundation(4202071)。
文摘In recent visual tracking research,correlation filter(CF)based trackers become popular because of their high speed and considerable accuracy.Previous methods mainly work on the extension of features and the solution of the boundary effect to learn a better correlation filter.However,the related studies are insufficient.By exploring the potential of trackers in these two aspects,a novel adaptive padding correlation filter(APCF)with feature group fusion is proposed for robust visual tracking in this paper based on the popular context-aware tracking framework.In the tracker,three feature groups are fused by use of the weighted sum of the normalized response maps,to alleviate the risk of drift caused by the extreme change of single feature.Moreover,to improve the adaptive ability of padding for the filter training of different object shapes,the best padding is selected from the preset pool according to tracking precision over the whole video,where tracking precision is predicted according to the prediction model trained by use of the sequence features of the first several frames.The sequence features include three traditional features and eight newly constructed features.Extensive experiments demonstrate that the proposed tracker is superior to most state-of-the-art correlation filter based trackers and has a stable improvement compared to the basic trackers.
基金Supported by the Program for Technology Innovation Team of Ningbo Government (No. 2011B81002)the Ningbo University Science Research Foundation (No.xkl11075)
文摘The generic Meanshift is susceptible to interference of background pixels with the target pixels in the kernel of the reference model, which compromises the tracking performance. In this paper, we enhance the target color feature by attenuating the background color within the kernel through enlarging the pixel weightings which map to the pixels on the target. This way, the background pixel interference is largely suppressed in the color histogram in the course of constructing the target reference model. In addition, the proposed method also reduces the number of Meanshift iterations, which speeds up the algorithmic convergence. The two tests validate the proposed approach with improved tracking robustness on real-world video sequences.
基金supported by the National Natural Science Foundationof China(61100207)the National Key Technology Research and Development Program of the Ministry of Science and Technology of China(2014BAK14B03)+1 种基金the Fundamental Research Funds for the Central Universities(2013PT132013XZ12)
文摘Most sensors or cameras discussed in the sensor network community are usually 3D homogeneous, even though their2 D coverage areas in the ground plane are heterogeneous. Meanwhile, observed objects of camera networks are usually simplified as 2D points in previous literature. However in actual application scenes, not only cameras are always heterogeneous with different height and action radiuses, but also the observed objects are with 3D features(i.e., height). This paper presents a sensor planning formulation addressing the efficiency enhancement of visual tracking in 3D heterogeneous camera networks that track and detect people traversing a region. The problem of sensor planning consists of three issues:(i) how to model the 3D heterogeneous cameras;(ii) how to rank the visibility, which ensures that the object of interest is visible in a camera's field of view;(iii) how to reconfigure the 3D viewing orientations of the cameras. This paper studies the geometric properties of 3D heterogeneous camera networks and addresses an evaluation formulation to rank the visibility of observed objects. Then a sensor planning method is proposed to improve the efficiency of visual tracking. Finally, the numerical results show that the proposed method can improve the tracking performance of the system compared to the conventional strategies.
文摘According to the main tools of TRIZ, the theory of inventive problem solving, a new flowchart of the product conceptual design process to solve contradiction in TRIZ is proposed. In order to realize autonomous moving and automatic weld seam tracking for welding robot in Tailed Welded Blanks, a creative design of robotic visual tracking system bused on CMOS has been developed by using the flowchart. The new system is not only used to inspect the workpiece ahead of a welding torch and measure the joint orientation and lateral deviation caused by curvature or discontinuity in the joint part, but also to record and measure the image size of the weld pool. Moreover, the hardware and software components are discussed in brief.
文摘To improve the reliability and accuracy of visual tracker,a robust visual tracking algorithm based on multi-cues fusion under Bayesian framework is proposed.The weighed color and texture cues of the object are applied to describe the moving object.An adjustable observation model is incorporated into particle filtering,which utilizes the properties of particle filter for coping with non-linear,non-Gaussian assumption and the ability to predict the position of the moving object in a cluttered environment and two complementary attributes are employed to estimate the matching similarity dynamically in term of the likelihood ratio factors;furthermore tunes the weight values according to the confidence map of the color and texture feature on-line adaptively to reconfigure the optimal observation likelihood model,which ensured attaining the maximum likelihood ratio in the tracking scenario even if in the situations where the object is occluded or illumination,pose and scale are time-variant.The experimental result shows that the algorithm can track a moving object accurately while the reliability of tracking in a challenging case is validated in the experimentation.
基金Supported by the Program for Technology Innovation Team of Ningbo Government (No. 2011B81002)the Ningbo University Science Research Foundation (No.xkl11075)
文摘This paper introduces an approach for visual tracking of multi-target with occlusion occurrence. Based on the author's previous work in which the Overlap Coefficient (OC) is used to detect the occlusion, in this paper a method of combining Bhattacharyya Coefficient (BC) and Kalman filter innovation term is proposed as the criteria for jointly detecting the occlusion occurrence. Fragmentation of target is introduced in order to closely monitor the occlusion development. In the course of occlusion, the Kalman predictor is applied to determine the location of the occluded target, and the criterion for checking the re-appearance of the occluded target is also presented. The proposed approach is put to test on a standard video sequence, suggesting the satisfactory performance in multi-target tracking.
基金supported by the Aeronautical Science Foundation of China under Grant 20115169016supported in part by the technique cooperation project of ZTE on Intelligent Video Analysis in 2012
文摘To tackle the problem of severe occlusions in visual tracking, we propose a hierarchical template-matching method based on a layered appearance model. This model integrates holistic- and part-region matching in order to locate an object in a coarse-to-fine manner. Furthermore, in order to reduce ambiguity in object localization, only the discriminative parts of an object' s appearance template are chosen for similarity computing with respect to their cornerness measurements. The similarity between parts is computed in a layer-wise manner, and from this, occlusions can be evaluated. When the object is partly occluded, it can be located accurately by matching candidate regions with the appearance template. When it is completely occluded, its location can be predicted from its historical motion information using a Kalman filter. The proposed tracker is tested on several practical image sequences, and the experimental results show that it can consistently provide accurate object location for stable tracking, even for severe occlusions.
基金This work was supported by the National Key Research and Development Plan(No.2016YFC0600908)the National Natural Science Foundation of China(No.61772530,U1610124)+1 种基金Natural Science Foundation of Jiangsu Province of China(No.BK20171192)China Postdoctoral Science Foundation(No.2016T90524,No.2014M551696).
文摘The colour feature is often used in the object tracking.The tracking methods extract the colour features of the object and the background,and distinguish them by a classifier.However,these existing methods simply use the colour information of the target pixels and do not consider the shape feature of the target,so that the description capability of the feature is weak.Moreover,incorporating shape information often leads to large feature dimension,which is not conducive to real-time object tracking.Recently,the emergence of visual tracking methods based on deep learning has also greatly increased the demand for computing resources of the algorithm.In this paper,we propose a real-time visual tracking method with compact shape and colour feature,which forms low dimensional compact shape and colour feature by fusing the shape and colour characteristics of the candidate object region,and reduces the dimensionality of the combined feature through the Hash function.The structural classification function is trained and updated online with dynamic data flow for adapting to the new frames.Further,the classification and prediction of the object are carried out with structured classification function.The experimental results demonstrate that the proposed tracker performs superiorly against several state-of-the-art algorithms on the challenging benchmark dataset OTB-100 and OTB-13.
文摘Visual tracking is a classical computer vision problem with many applications.Efficient convolution operators(ECO)is one of the most outstanding visual tracking algorithms in recent years,it has shown great performance using discriminative correlation filter(DCF)together with HOG,color maps and VGGNet features.Inspired by new deep learning models,this paper propose a hybrid efficient convolution operators integrating fully convolution network(FCN)and residual network(ResNet)for visual tracking,where FCN and ResNet are introduced in our proposed method to segment the objects from backgrounds and extract hierarchical feature maps of objects,respectively.Compared with the traditional VGGNet,our approach has higher accuracy for dealing with the issues of segmentation and image size.The experiments show that our approach would obtain better performance than ECO in terms of precision plot and success rate plot on OTB-2013 and UAV123 datasets.
基金This work was supported by The National Science Foundation(No.60474009),Shu Guang Program(No.05SG48)Scientific Programm ofShanghai Education Committee(No.07zz90).
文摘This paper addresses the robust visual tracking of multi-feature points for a 3D manipulator with unknown intrinsic and extrinsic parameters of the vision system. This class of control systems are highly nonlinear control systems characterized as time-varying and strong coupling in states and unknown parameters. It is first pointed out that not only is the Jacobian image matrix nonsingular, but also its minimum singular value has a positive limit. This provides the foundation of kinematics and dynamics control of manipulators with visual feedback. Second, the Euler angle expressed rotation transformation is employed to estimate a subspace of the parameter space of the vision system. Based on the two results above, and arbitrarily chosen parameters in this subspace, the tracking controllers are proposed so that the image errors can be made as small as desired so long as the control gain is allowed to be large. The controller does not use visual velocity to achieve high and robust performance with low sampling rate of the vision system. The obtained results are proved by Lyapunov direct method. Experiments are included to demonstrate the effectiveness of the proposed controller.
文摘To solve the problem of low robustness of trackers under significant appearance changes in complex background,a novel moving target tracking method based on hierarchical deep features weighted fusion and correlation filter is proposed.Firstly,multi-layer features are extracted by a deep model pre-trained on massive object recognition datasets.The linearly separable features of Relu3-1,Relu4-1 and Relu5-4 layers from VGG-Net-19 are especially suitable for target tracking.Then,correlation filters over hierarchical convolutional features are learned to generate their correlation response maps.Finally,a novel approach of weight adjustment is presented to fuse response maps.The maximum value of the final response map is just the location of the target.Extensive experiments on the object tracking benchmark datasets demonstrate the high robustness and recognition precision compared with several state-of-the-art trackers under the different conditions.
文摘Visual tracking has been widely applied in construction industry and attracted signifi-cant interests recently. Lots of research studies have adopted visual tracking techniques on the surveillance of construction workforce, project productivity and construction safety. Until now, visual tracking algorithms have gained promising performance when tracking un-articulated equipment in construction sites. However, state-of-art tracking algorithms have unguaranteed performance in tracking articulated equipment, such as backhoes and excavators. The stretching buckets and booms are the main obstacles of successfully tracking articulated equipment. In order to fill this knowledge gap, the part-based tracking algorithms are introduced in this paper for tracking articulated equipment in construction sites. The part-based tracking is able to track different parts of target equipment while using multiple tracking algorithms at the same sequence. Some existing tracking methods have been chosen according to their outstanding performance in the computer vision community. Then, the part-based algorithms were created on the basis of selected visual tracking methods and tested by real construction sequences. In this way, the tracking performance was evaluated from effectiveness and robustness aspects. Throughout the quantification analysis, the tracking performance of articulated equipment was much more improved by using the part-based tracking algorithms.
基金National High Technology Research and Development Programof China (863program,No.2002AA42D110-2)
文摘The 3D object visual tracking problem is studied for the robot vision system of the 220kV/330kV high-voltage live-line insulator cleaning robot. The SUSAN Edge based Scale Invariant Feature (SESIF) algorithm based 3D objects visual tracking is achieved in three stages: the first frame stage,tracking stage,and recovering stage. An SESIF based objects recognition algorithm is proposed to find initial location at both the first frame stage and recovering stage. An SESIF and Lie group based visual tracking algorithm is used to track 3D object. Experiments verify the algorithm's robustness. This algorithm will be used in the second generation of the 220kV/330kV high-voltage live-line insulator cleaning robot.
文摘提出了一种利用"bag of words"模型对视频内容进行建模和匹配的方法。通过量化视频帧的局部特征构建视觉关键词(visual words)辞典,将视频的子镜头表示成若干视觉关键词的集合。在此基础上构建基于子镜头的视觉关键词词组的倒排索引,用于视频片段的匹配和检索。这种方法保留了局部特征的显著性及其相对位置关系,而且有效地压缩了视频的表达,加速的视频的匹配和检索过程。实验结果表明,和已有方法相比,基于"bag of words"的视频匹配方法在大视频样本库上获得了更高的检索精度和检索速度。