Abstract
Syntax elements in encoded video streams, such as motion vectors (MVs) and residuals, can substitute for optical flow as a motion representation, but their inherent pixel noise and feature sparsity reduce accuracy when recognizing fine movements. To address this, a high-accuracy, low-complexity dynamic gesture recognition framework is designed on the basis of data optimization of the syntax elements of encoded video. First, a key P-frame selection method is proposed, which addresses feature sparsity by selecting encoded frames with higher information content. Second, a joint residual feature representation method is proposed, which uses residuals to obtain fine gesture contour maps and removes pixel noise outside the hand region from the MVs. Finally, a lightweight and efficient dynamic gesture recognition model is designed that uses the optimized MVs and residuals to achieve an effect similar to that of optical flow. The proposed method is validated on the VIVA, Sheffield Kinect Gesture (SKIG), NvGesture, and EgoGesture datasets. Experimental results show that, using only the RGB modality, the method achieves recognition accuracies of 82.94%, 99.72%, 81.12%, and 90.48%, respectively, reduces storage overhead by 89%, and obtains results comparable to those of advanced methods while running 4.7 times faster.
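The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how "information content" is measured or how the contour map is built, so the sketch assumes that residual energy (sum of absolute residual values) ranks P-frames and that a simple threshold on residual magnitude stands in for the gesture contour map. The names `residual_energy`, `select_key_p_frames`, `mask_motion_vectors`, and the parameters `k` and `threshold` are all hypothetical.

```python
from typing import List, Tuple

Frame = List[List[float]]               # per-pixel residual magnitude
MVField = List[List[Tuple[int, int]]]   # per-pixel (dx, dy) motion vector


def residual_energy(residual: Frame) -> float:
    """Sum of absolute residual values (assumed proxy for information content)."""
    return sum(abs(v) for row in residual for v in row)


def select_key_p_frames(residuals: List[Frame], k: int) -> List[int]:
    """Return the indices of the k highest-energy P-frames, in temporal order."""
    ranked = sorted(
        range(len(residuals)),
        key=lambda i: residual_energy(residuals[i]),
        reverse=True,
    )
    return sorted(ranked[:k])


def mask_motion_vectors(mv: MVField, residual: Frame, threshold: float) -> MVField:
    """Zero every motion vector whose residual magnitude is below the threshold.

    The thresholded residual is an assumed stand-in for the contour map.
    """
    return [
        [vec if abs(r) >= threshold else (0, 0) for vec, r in zip(mv_row, res_row)]
        for mv_row, res_row in zip(mv, residual)
    ]


# Toy data: three 2x2 P-frame residuals and one 2x2 motion-vector field.
residuals = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[5.0, 0.0], [0.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
mv_field = [[(1, 1), (2, 0)], [(0, 3), (4, 4)]]

key_frames = select_key_p_frames(residuals, k=2)
cleaned = mask_motion_vectors(mv_field, residuals[1], threshold=2.0)
```

In this toy input, the motion vectors at pixels with little residual energy are treated as background noise and set to zero, while those coinciding with strong residuals are kept.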
Authors
XIE Xiaoyan (谢晓燕)
CAO Panyu (曹盘宇)
XIA Hao (夏浩)
CHEN Yuxin (陈雨馨)
School of Computer, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
Source
Journal of Beijing University of Posts and Telecommunications (《北京邮电大学学报》)
EI
CAS
CSCD
Peking University Core Journal (北大核心)
2024, No. 2, pp. 90-96 (7 pages)
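Funding
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0119001)
National Natural Science Foundation of China (61834005, 61772417)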
Keywords
dynamic gesture recognition
encoded video
motion vector
residual
data optimization