Abstract
Violent sorting by sorting personnel is common in the express logistics industry. Image-based behavior recognition can be used to curb such behavior, but in practical scenarios it suffers from poor algorithmic robustness and the difficulty of obtaining human joint point data. To address these problems, a video dataset of violent sorting behaviors in logistics is constructed and a recognition model for such behaviors is studied. Sorting videos are collected in both indoor and outdoor scenarios with Raspberry Pi devices, real-time video image transmission is implemented with the Python socket module, slice-based screening rules are applied to remove non-standard data, and the OpenPose model is used to extract joint point data. Since general human activity recognition networks cannot adequately reflect how strongly individual joints influence violent sorting actions, an optimized graph neural network, ST-AGCN, is developed with ST-GCN as the backbone. A spatial attention mechanism learns the influence of different joints on various actions and updates the weight of each joint, while an adaptive graph structure layer jointly optimizes the topology of the human skeleton graph and the network parameters in an end-to-end manner, highlighting the contribution of highly correlated joints to action recognition. Comparative and ablation experiments against several deep learning models are conducted on violent sorting videos captured in indoor and outdoor environments. The results show that the accuracy of the ST-AGCN model in recognizing violent sorting behavior in real scenes is 5.6%, 13.82%, 2.36%, and 1.61% higher than that of ST-GCN, STA-LSTM, ST-AGCN without the spatial attention mechanism, and ST-AGCN without the adaptive graph structure layer, respectively. The model is also suitable for complex logistics sorting scenes with cluttered indoor and outdoor environments and partial occlusion, which verifies the superiority of ST-AGCN and the effectiveness of the spatial attention mechanism and the adaptive graph structure layer.
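The core idea described in the abstract can be sketched as a graph convolution over skeleton joints in which the fixed skeleton adjacency is augmented by a learnable adaptive adjacency, and a spatial attention vector re-weights the joints before neighbor aggregation. The following is a minimal PyTorch-style sketch under assumed names and shapes (the module name AdaptiveAttentionGraphConv, the (N, C, T, V) tensor layout, the placeholder identity adjacency); it is illustrative only and is not the authors' ST-AGCN implementation.

```python
import torch
import torch.nn as nn

class AdaptiveAttentionGraphConv(nn.Module):
    """Minimal sketch of one spatial layer in the spirit of ST-AGCN.

    A (V x V) is the fixed skeleton adjacency; B is a learnable "adaptive"
    adjacency optimized end-to-end together with the network weights, and a
    spatial attention vector re-weights the V joints per sample. All names
    and shapes are illustrative, not the authors' implementation.
    """

    def __init__(self, in_channels, out_channels, A):
        super().__init__()
        self.register_buffer("A", A)                 # fixed skeleton graph, shape (V, V)
        self.B = nn.Parameter(torch.zeros_like(A))   # adaptive graph structure layer
        self.theta = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.att = nn.Sequential(                    # spatial attention over joints
            nn.AdaptiveAvgPool2d((1, None)),         # pool over time: (N, C, T, V) -> (N, C, 1, V)
            nn.Conv2d(in_channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x: (N, C, T, V) -- batch, channels, frames, joints
        joint_weights = self.att(x)                  # (N, 1, 1, V), importance of each joint
        x = x * joint_weights                        # emphasize joints relevant to the action
        adj = self.A + self.B                        # fixed topology + learned connections
        x = torch.einsum("nctv,vw->nctw", x, adj)    # aggregate neighbor features on the graph
        return self.theta(x)                         # 1x1 conv to mix channels


# Toy usage: 25 OpenPose-style joints, (x, y, confidence) as input channels
V = 25
A = torch.eye(V)   # placeholder adjacency; a real one encodes the skeleton bones
layer = AdaptiveAttentionGraphConv(in_channels=3, out_channels=64, A=A)
out = layer(torch.randn(8, 3, 32, V))                # -> (8, 64, 32, 25)
```

In this sketch the adaptive adjacency B is initialized to zero so that training starts from the physical skeleton topology and only gradually adds data-driven connections between strongly correlated joints.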
Authors
曹菁菁
余宙
李鹏飞
闵艳萍
黄齐贤
赵强伟
CAO Jingjing; YU Zhou; LI Pengfei; MIN Yanping; HUANG Qixian; ZHAO Qiangwei (School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, China)
Source
《交通信息与安全》
CSCD
Peking University Core Journals (PKU Core)
2023, No. 5, pp. 115-126 (12 pages)
Journal of Transport Information and Safety
Funding
Supported by the Young Scientists Fund of the National Natural Science Foundation of China (No. 61502360).
Keywords
intelligent logistics
violent sorting
spatial-temporal graph convolutional networks
adaptive graph structure layer
human activity recognition