Abstract
Graph-convolution-based skeleton action recognition methods rely heavily on hand-designed graph topologies when modeling joint features and lack the ability to model global dependencies between joints. To address this, a spatio-temporal convolutional Transformer is designed to model spatial and temporal joint features. For spatial joint feature modeling, a dynamic grouping decoupled Transformer is proposed: the input skeleton sequence is split into groups along the channel dimension and a different attention matrix is dynamically generated for each group, which allows global spatial dependencies between joints to be modeled without prior knowledge of the human body topology. For temporal joint feature modeling, multi-scale temporal convolution extracts action features at different temporal scales. Finally, a spatio-temporal-channel joint attention module is proposed to further refine the extracted spatio-temporal features. The method achieves Top-1 recognition accuracies of 92.5% and 89.3% under the cross-subject evaluation protocol on the NTU-RGB+D and NTU-RGB+D 120 datasets, respectively, demonstrating its effectiveness.
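The paper itself does not include code; the following is a minimal PyTorch-style sketch of what a channel-grouped, decoupled self-attention layer over skeleton joints could look like, assuming (N, C, T, V) input tensors (batch, channels, frames, joints). The layer names, the number of groups, and the pooling of attention over channels and time are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class GroupedDecoupledAttention(nn.Module):
    """Sketch of grouped, decoupled spatial attention: channels are split into
    groups and a separate dynamic V x V attention matrix is computed per group,
    so no predefined skeleton graph is required."""

    def __init__(self, in_channels, out_channels, num_groups=8):
        super().__init__()
        assert out_channels % num_groups == 0
        self.num_groups = num_groups
        self.q = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.k = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.v = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.proj = nn.Conv2d(out_channels, out_channels, kernel_size=1)

    def forward(self, x):
        # x: (N, C, T, V) skeleton features
        N, C, T, V = x.shape
        G = self.num_groups
        q = self.q(x).view(N, G, -1, T, V)  # (N, G, C'/G, T, V)
        k = self.k(x).view(N, G, -1, T, V)
        v = self.v(x).view(N, G, -1, T, V)
        # one dynamic V x V attention matrix per group, aggregated over channels and time
        attn = torch.einsum('ngctu,ngctv->nguv', q, k) / float(q.shape[2] * T) ** 0.5
        attn = attn.softmax(dim=-1)  # (N, G, V, V)
        # apply each group's attention matrix to that group's value features
        out = torch.einsum('nguv,ngctv->ngctu', attn, v)
        out = out.reshape(N, -1, T, V)
        return self.proj(out)
```

In the full model described in the abstract, such a spatial module would be followed by multi-scale temporal convolutions and the spatio-temporal-channel joint attention; those parts are omitted here.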
Authors
Liu Binbin; Zhao Hongtao; Wang Tian; Yang Yi
Zhengzhou Hengda Intelligent Control Technology Company Limited, Zhengzhou 450000, China; School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454003, China; Research Institute for Artificial Intelligence, Beihang University, Beijing 100191, China
Source
《电子测量技术》
Peking University core journal (北大核心)
2024, No. 1, pp. 169-177 (9 pages)
Electronic Measurement Technology
Funding
Supported by the National Natural Science Foundation of China (Grant No. 61972016).