Robust and fast visual tracking using discriminative sparse representation (判别稀疏表示鲁棒快速视觉跟踪)

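Abstract: Objective: L1 tracking is robust to partial occlusion, but it is prone to model drift and is computationally slow. To address these two problems, this paper proposes a visual tracking method based on discriminative sparse representation. Method: Taking into account interference from background and occlusion information, a discriminative sparse representation model is proposed. Based on the principle of block coordinate optimization, a fast solver for the model is designed using the learned iterative shrinkage and thresholding algorithm and the soft-thresholding operation. Result: On eight image sequences, the proposed method is compared with four classic tracking methods in terms of robustness and sparse-representation computing time. In the qualitative and quantitative robustness experiments, the proposed method adapts well to a variety of interfering factors during tracking, and its precision curve is better than those of the other methods as the location error threshold varies from 0 to 50 pixels. In terms of sparse-representation computing time, with templates of size 16 × 16 and 32 × 32 the proposed algorithm takes 0.152 s and 0.257 s, respectively, which is clearly faster than the other methods in the experiment. Conclusion: Compared with classic tracking methods, the proposed method achieves fast target tracking while overcoming adverse factors such as occlusion, background interference, and appearance changes. Because it combines fast sparse-representation computation with robustness to many interfering factors, it can be applied in practical settings such as video surveillance and sports.
Objective: Visual tracking is an important field in computer vision and is applied in various domains. Although numerous visual tracking methods have been developed in the past several decades, many challenging issues (e.g., occlusions, illumination changes, and background clutter) still affect the tracking performance of these methods. Inspired by sparse representation applied in face recognition, the L1 tracker based on sparse representation was proposed by Mei et al. The L1 tracker is robust to partial occlusion but is prone to model drift and is time-consuming. To address these two problems, this study proposes a tracking method based on discriminative sparse representation. Method: Considering the interference of background and occlusion information, a discriminative sparse representation model is proposed. The proposed model exploits the sparseness of the coefficients associated with target and background templates so that the candidate targets can be represented accurately. The sparseness of the coefficients associated with trivial templates makes the proposed tracker robust to partial occlusion. By using only the coefficients associated with trivial and target templates, the observation likelihood model adopted in this study eliminates the interference of background information and leads to improved tracking results.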
A fast sparse representation algorithm is designed to increase the tracking speed and is used to calculate the coefficients of the discriminative sparse representation model. In the first stage, the proposed algorithm uses the learned iterative shrinkage and thresholding algorithm (LISTA) to calculate the coefficients associated with target templates. In the second stage, it uses the soft shrinkage operator to calculate the coefficients associated with trivial templates. Based on block coordinate optimization theory, these two stages are applied iteratively to obtain the sparse representation coefficients. Under the particle filter framework, the tracking task is accomplished with the proposed model and the fast solution algorithm. Result: The proposed tracker is tested on eight sequences, namely, FaceOcc1, FaceOcc2, David3, Dudek, Singer1, Car4, Jumping, and CarDark. The strength of the proposed tracker is analyzed by comparing it with the L1, L1APG (L1 tracker based on accelerated proximal gradient), SP, and L1L2 trackers. The issues in these sequences include occlusion, in-plane rotation, out-of-plane rotation, target appearance variations, illumination changes, camera motion, scale changes, motion blur, and background clutter. The selected state-of-the-art trackers, which are used to demonstrate the effectiveness of the proposed tracker, are all based on sparse representation. The L1APG, SP, and L1L2 trackers are improvements of the L1 tracker. For robustness evaluation, qualitative and quantitative experiments are conducted. The qualitative comparison shows that the proposed tracker overcomes various challenging issues during tracking. For the quantitative comparison, a precision plot is used to analyze the performance of the proposed tracker.
With the location error threshold varying from 0 to 50 pixels, the precision plot of the proposed tracker is better than those of the other trackers on the eight sequences. In terms of computing speed, the proposed algorithm significantly reduces the computational cost of sparse representation. The time for solving an image patch is 0.152 s and 0.257 s for patches with resolutions of 16 × 16 and 32 × 32, respectively. The proposed tracker consumes less time than the others in the experiment. Compared with other trackers that do not adopt background templates to construct the sparse representation model, the proposed tracker produces better tracking results. Conclusion: The proposed tracker is more robust to occlusion and other challenges, such as background clutter and appearance changes, and has better tracking speed than the state-of-the-art trackers. Thus, trackers based on the proposed method can be used in many engineering applications, such as video surveillance, medical diagnosis, and athletics. The method adopted to update the target templates has low time consumption but may sometimes introduce interfering information into the tracker. Thus, a more effective method of updating target templates needs to be developed in the future.
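
The two-stage, block coordinate solver described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a candidate y is modeled as y ≈ Dc + e, where D holds the target and background templates and e holds the trivial-template coefficients, with objective 0.5‖y − Dc − e‖² + λc‖c‖₁ + λe‖e‖₁. Plain ISTA is substituted for the learned LISTA network, and all names (`soft_threshold`, `ista`, `objective`, `solve`, `lam_c`, `lam_e`) and parameter values are hypothetical.

```python
import numpy as np


def soft_threshold(x, tau):
    """Soft shrinkage operator: shrink each entry of x toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def ista(D, r, lam, c0, n_iter=50):
    """Plain ISTA for min_c 0.5*||r - D c||^2 + lam*||c||_1.

    Assumption: used here as a stand-in for LISTA, which would replace
    these fixed iterations with a small learned feed-forward network.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    c = c0.copy()
    for _ in range(n_iter):
        grad = D.T @ (D @ c - r)
        c = soft_threshold(c - grad / L, lam / L)
    return c


def objective(D, y, c, e, lam_c, lam_e):
    """Value of the assumed sparse representation objective."""
    resid = y - D @ c - e
    return 0.5 * resid @ resid + lam_c * np.abs(c).sum() + lam_e * np.abs(e).sum()


def solve(D, y, lam_c=0.01, lam_e=0.1, n_outer=10, n_inner=50):
    """Alternate between the two coefficient blocks (block coordinate descent)."""
    c = np.zeros(D.shape[1])
    e = np.zeros_like(y)
    for _ in range(n_outer):
        # Stage 1: template coefficients with the occlusion term held fixed.
        c = ista(D, y - e, lam_c, c, n_inner)
        # Stage 2: trivial-template coefficients in closed form.
        e = soft_threshold(y - D @ c, lam_e)
    return c, e
```

With the template coefficients fixed, the trivial-template block has an identity dictionary, so its update reduces to a single soft-thresholding of the residual; this closed form is what makes the second stage cheap.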

Authors: Liu Wenzhuo (刘文琢), Yuan Guanglin (袁广林), Xue Mogen (薛模根)

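Affiliations: Anhui Province Key Laboratory of Polarization Imaging Detection Technology, Army Officer Academy; Eleventh Department, Army Officer Academy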

Source: Journal of Image and Graphics (《中国图象图形学报》), CSCD, Peking University Core Journal, 2017, No. 6, pp. 815-823 (9 pages)

Funding: National Natural Science Foundation of China (61175035, 61379105)

Keywords: machine vision; target tracking; discriminative sparse representation; feed-forward neural network; particle filter

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]