Abstract
Human action recognition has attracted considerable attention in computer vision because of its important role in public safety. However, when fusing the neighborhood features of multi-scale nodes, existing graph convolutional networks usually sum the adjacency matrices of each order directly, giving every term the same importance; this makes it hard to focus on important features and hinders the establishment of optimal node relationships. In addition, the common two-stream fusion method, which averages the predictions of different models, ignores potential differences in data distribution, so the fusion effect is poor. To address these problems, this paper proposes a two-stream adaptive attention graph convolutional network for human action recognition. First, a multi-order adjacency matrix with adaptively balanced weights is designed so that the model focuses on the more important neighborhoods. Second, a multi-scale spatio-temporal self-attention module and a channel attention module are designed to enhance the model's feature extraction capability. Finally, a two-stream fusion network is proposed that uses the data distribution of the two streams' prediction results to determine the fusion coefficients, improving the fusion effect. The algorithm achieves recognition accuracies of 92.3% and 97.5% on the cross-subject and cross-view subsets of NTU RGB+D, respectively, and 39.8% on the Kinetics-Skeleton dataset, all higher than those of existing algorithms, which indicates the superiority of the proposed algorithm for human action recognition.
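The contrast between direct summation and adaptively weighted multi-order adjacency matrices can be illustrated with a small NumPy sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption made for illustration: the function names `multi_order_adjacency`, `softmax` and `combine`, the use of row-normalised k-hop reachability matrices, and the softmax over one scalar logit per order. In a trained network the logits would be learnable parameters; here they are fixed by hand.

```python
import numpy as np

def multi_order_adjacency(adj: np.ndarray, max_order: int) -> np.ndarray:
    """Stack row-normalised k-hop reachability matrices for k = 0..max_order."""
    n = adj.shape[0]
    hop = np.eye(n)                 # order 0: each joint sees only itself
    step = adj + np.eye(n)          # one hop, with self-loops
    orders = []
    for _ in range(max_order + 1):
        binary = (hop > 0).astype(float)            # reachable within k hops
        orders.append(binary / binary.sum(axis=1, keepdims=True))
        hop = hop @ step
    return np.stack(orders)         # shape (max_order + 1, n, n)

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def combine(orders: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Weighted sum of the per-order matrices; weights are softmax(logits)."""
    weights = softmax(logits)       # assumed weighting scheme, not the paper's
    return np.tensordot(weights, orders, axes=1)

# Toy skeleton (hypothetical): a chain of four joints 0-1-2-3.
adj = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

orders = multi_order_adjacency(adj, max_order=2)

# Equal logits reproduce direct summation (up to a constant factor).
uniform = combine(orders, np.zeros(3))

# Unequal logits let the 1-hop neighborhood dominate.
adaptive = combine(orders, np.array([0.0, 2.0, 0.0]))
```

With equal logits every order contributes equally, which corresponds to the direct-summation baseline the abstract criticises; raising one logit shifts weight toward that neighborhood scale while each row of the combined matrix still sums to 1.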
Authors
DU Qiliang (杜启亮)
XIANG Zhaoyi (向照夷)
TIAN Lianfang (田联房)
YU Lubin (余陆斌)
Affiliations: School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China; China-Singapore International Joint Research Institute, South China University of Technology, Guangzhou 510555, Guangdong, China; Key Laboratory of Autonomous Systems and Network Control of the Ministry of Education, South China University of Technology, Guangzhou 510640, Guangdong, China; Research Institute of Modern Industrial Innovation, South China University of Technology, Zhuhai 519170, Guangdong, China
Source
Journal of South China University of Technology (Natural Science Edition) (《华南理工大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2022, Issue 12, pp. 20-29 (10 pages)
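Funding
Guangdong Province Special Fund for Marine Economic Development (GDNRC[2020]018)
Key-Area Research and Development Program of Guangdong Province (2019B020214001, 2018B010109001)
Guangzhou Major Industrial Technology Research Program (2019-01-01-12-1006-0001)
Fundamental Research Funds for the Central Universities, South China University of Technology (2018KZ05)
Graduate Education Reform Project of South China University of Technology (zysk2018005)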
Keywords
action recognition
graph convolutional network
adjacency matrix
attention
two-stream fusion