Abstract
For human action recognition, this paper proposes detecting interest points in 3D video signals by combining a steerable filter with a Gabor filter. HOG/HOF descriptors computed at the 3D interest points serve as low-level features and are encoded by sparse coding. The 3D interest points are then projected along the time axis onto the x-y plane, where a spatial pyramid and max pooling represent each video as a concatenation of pooled vectors, one per pyramid cell. Because this representation incorporates the distribution of features across the whole training database, it can serve as a mid-level feature representation. With an SVM classifier on the HMDB database, the proposed approach outperforms the feature representation based on the Harris 3D filter and substantially improves on recently proposed algorithms.
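The pooling step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `spatial_pyramid_max_pool`, the pyramid levels (1, 2, 4), the frame size, and pooling over absolute code values are all assumptions made for illustration, and the sparse codes are random stand-ins for codes that would come from HOG/HOF descriptors.

```python
import numpy as np

def spatial_pyramid_max_pool(points_xy, codes, width, height, levels=(1, 2, 4)):
    """Max-pool sparse codes over a spatial pyramid of the x-y plane.

    points_xy : (N, 2) interest-point (x, y) positions; the time coordinate
                is assumed to be already dropped (projection along t).
    codes     : (N, K) sparse codes, one row per interest point.
    levels    : grid size per pyramid level (hypothetical choice).
    Returns a vector of length K * sum(g * g for g in levels).
    """
    points_xy = np.asarray(points_xy, dtype=float)
    # Assumption: pool absolute values, a common choice for sparse codes.
    codes = np.abs(np.asarray(codes, dtype=float))
    n_atoms = codes.shape[1]
    pooled = []
    for g in levels:
        # Map each point to a cell of the g x g grid, clipping the far edge.
        col = np.minimum((points_xy[:, 0] / width * g).astype(int), g - 1)
        row = np.minimum((points_xy[:, 1] / height * g).astype(int), g - 1)
        cell = row * g + col
        for c in range(g * g):
            mask = cell == c
            if mask.any():
                pooled.append(codes[mask].max(axis=0))
            else:
                # Empty cells contribute an all-zero vector.
                pooled.append(np.zeros(n_atoms))
    # Concatenate the per-cell vectors into one video-level descriptor.
    return np.concatenate(pooled)

# Usage with random stand-in data: 50 points, 8 dictionary atoms.
rng = np.random.default_rng(0)
pts = rng.uniform(0, [320, 240], size=(50, 2))
sparse_codes = rng.normal(size=(50, 8))
video_vec = spatial_pyramid_max_pool(pts, sparse_codes, 320, 240)
print(video_vec.shape)  # (168,) = 8 atoms * (1 + 4 + 16) cells
```

The resulting fixed-length vector is what a classifier such as an SVM would receive for each video, regardless of how many interest points the video contains.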
Source
《无线电通信技术》
2013, Issue 5, pp. 81-84 (4 pages)
Radio Communications Technology