Abstract
Action recognition is a fundamental task in computer vision. Because skeleton sequences capture most of the information about an action, skeleton-based action recognition has attracted considerable research attention. The human skeleton is naturally a graph, so graph convolution is widely used for this task. However, ordinary graph convolution aggregates only low-order information between pairs of nodes and cannot model high-order relationships among multiple nodes. To address this problem, this paper proposes a multiscale hypergraph convolutional network that aggregates richer information in both the spatial and temporal dimensions to improve recognition accuracy. The network adopts an encoder-decoder structure: the encoder uses a hypergraph convolution module to aggregate information among the multiple nodes in each hyperedge, and the decoder uses a hypergraph fusion module to restore the original skeleton structure. In addition, a multiscale temporal graph convolution module based on dilated convolution is designed to better aggregate motion information along the temporal dimension. Experimental results on the NTU RGB+D and Kinetics datasets verify the effectiveness of the method.
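The two ideas named in the abstract, aggregation over hyperedges and multiscale dilated convolution over time, can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the mean-based node-to-hyperedge-to-node aggregation, the averaging of dilation branches, the function names, and the toy joint grouping are all assumptions chosen for illustration, and learnable weights are omitted.

```python
from typing import List, Sequence


def hyperedge_mean_aggregate(
    features: Sequence[Sequence[float]],
    hyperedges: Sequence[Sequence[int]],
) -> List[List[float]]:
    """Two-step hypergraph aggregation: nodes -> hyperedges -> nodes.

    Assumption: plain means are used in both steps (no learned weights).
    """
    dim = len(features[0])
    # Step 1: each hyperedge feature is the mean of ALL its member nodes,
    # so more than two joints are mixed in a single step.
    edge_feats = [
        [sum(features[v][d] for v in members) / len(members) for d in range(dim)]
        for members in hyperedges
    ]
    out = []
    for v in range(len(features)):
        incident = [e for e, members in enumerate(hyperedges) if v in members]
        if not incident:
            out.append(list(features[v]))  # node in no hyperedge: keep as-is
            continue
        # Step 2: each node receives the mean of its incident hyperedges.
        out.append(
            [sum(edge_feats[e][d] for e in incident) / len(incident) for d in range(dim)]
        )
    return out


def dilated_temporal_conv(
    seq: Sequence[float], kernel: Sequence[float], dilation: int
) -> List[float]:
    """1-D dilated convolution over time, zero-padded to keep the length.

    Assumption: the kernel has odd length and is centred on each frame.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + (i - half) * dilation  # taps are `dilation` frames apart
            if 0 <= idx < len(seq):
                acc += w * seq[idx]
        out.append(acc)
    return out


def multiscale_temporal(
    seq: Sequence[float], kernel: Sequence[float], dilations: Sequence[int] = (1, 2, 3)
) -> List[float]:
    """Run several dilation rates in parallel and average the branches.

    Assumption: branches are fused by a simple average.
    """
    branches = [dilated_temporal_conv(seq, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


# Toy example: 5 joints with scalar features, grouped into two hypothetical
# hyperedges that share joint 2.
joint_feats = [[1.0], [2.0], [3.0], [4.0], [5.0]]
parts = [[0, 1, 2], [2, 3, 4]]
spatial_out = hyperedge_mean_aggregate(joint_feats, parts)

# Toy example: one joint coordinate over 5 frames, dilation rate 2.
temporal_out = dilated_temporal_conv([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 1.0], 2)
```

In the toy spatial example, joint 2 belongs to both hyperedges and therefore receives information from all five joints in one step, which pairwise edges alone would not provide. Larger dilation rates widen the temporal window without adding kernel taps.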
Authors
秦晓飞
赵颖
张逸杰
杜睿杰
钱汉文
陈萌
张文奇
张学典
QIN Xiaofei; ZHAO Ying; ZHANG Yijie; DU Ruijie; QIAN Hanwen; CHEN Meng; ZHANG Wenqi; ZHANG Xuedian (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; Institute of Aerospace System Engineering of Shanghai, Shanghai 201109, China)
Source
《光学仪器》 (Optical Instruments)
2022, No. 4, pp. 39-48 (10 pages)
Funding
Shanghai Artificial Intelligence Program (2019-RGZN-01077).
Keywords
action recognition
graph convolution
hypergraph convolution
dilated convolution