Hand Gesture Recognition in Complex Backgrounds Based on Structure Reparameterization and Attention Mechanism

Original title: 基于结构重参数化和注意力机制的复杂背景下手势识别

Abstract: To address the low recognition accuracy and slow recognition speed caused by heavy interference in gesture images with complex backgrounds, a gesture recognition algorithm based on structural reparameterization and an attention mechanism, RepSEHGR (re-parameter squeeze-expand hand gesture recognition), is proposed. Structural reparameterization is applied to the residual structure so that redundant branches are removed at the deployment stage, which increases recognition speed. A channel attention module is embedded as well; by weighting the features of different channels, it directs the algorithm toward gesture features and reduces interference from complex backgrounds. Two data augmentation methods, cutout and affine transformation, are used during training to suppress complex background noise in the input and to augment the data, reducing overfitting while improving robustness. In comparative experiments on a complex-background gesture dataset, recognition accuracy reaches 99.9% and recognition speed reaches 200 fps, demonstrating the effectiveness of the proposed algorithm.

As a highly adaptive form of human-computer interaction, gestures simplify interaction by eliminating physical contact between users and mechanical devices. Gesture interaction is more intuitive and offers richer interaction effects, better meeting people's needs and expectations. Gesture recognition has been widely studied in human-computer interaction, especially gesture recognition based on machine vision, which is low-cost, natural, and contact-free. However, existing gesture recognition methods are designed primarily for simple experimental backgrounds, whereas in real human-computer interaction gesture recognition usually takes place in a variety of complex environments.

In practice, changes in brightness, complex backgrounds, and interference from similar colors are the key factors affecting gesture recognition accuracy. Interference from complex backgrounds severely hampers the extraction of gesture features, making it difficult to recognize gestures quickly and accurately. Some researchers use a two-stage model that first extracts the gesture region and then classifies it, while others apply deep convolutional neural networks directly to complex-background gesture images. However, two-stage methods are rarely fast enough for practical applications, and the accuracy of single-stage methods on complex-background gesture images still needs to be improved.
Existing gesture recognition methods cannot handle gesture recognition under real complex backgrounds because they struggle to balance recognition speed and accuracy. The key is either to eliminate or weaken the interference of complex backgrounds while improving recognition speed, or to strengthen gesture feature extraction so that the algorithm represents gesture information correctly. The attention mechanism imitates the way the human visual system attends to objects: by increasing attention on the target region, it captures that region's detailed information. Embedding an attention mechanism in a deep-learning-based gesture recognition algorithm lets the algorithm focus on the features of the target gesture region and suppress interference from complex backgrounds. Meanwhile, structural reparameterization removes redundant branch structures at the deployment stage and improves recognition speed.

To address the low recognition accuracy and slow recognition speed caused by heavy interference in gesture images with complex backgrounds, a gesture recognition algorithm, RepSEHGR, based on structural reparameterization and an attention mechanism is proposed. Structural reparameterization is applied to the residual structure so that redundant branches are removed at the deployment stage, which improves recognition speed. Meanwhile, a channel attention module is embedded so that the algorithm attends to gesture features by weighting the features of different channels, thereby reducing complex background interference.
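The two mechanisms described above can be illustrated with a small NumPy sketch. It is not the authors' RepSEHGR implementation: the abstract does not give the layer layout, so the sketch assumes a RepVGG-style block (a 3x3 branch, a 1x1 branch, and an identity shortcut, with batch normalization on the two convolution branches) followed by squeeze-and-excitation style channel gating. All function names, shapes, and the bottleneck width are hypothetical. The point it shows is that, because convolution and inference-mode batch normalization are linear, the three training-time branches collapse into a single 3x3 convolution that produces the same output.

```python
import numpy as np

def conv2d(x, w, b):
    """Stride-1 'same' convolution. x: (C, H, W), w: (O, C, k, k), b: (O,)."""
    k = w.shape[-1]
    p = k // 2
    h, wd = x.shape[1:]
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(k):
        for j in range(k):
            # Contribution of kernel tap (i, j) to every output pixel.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out + b[:, None, None]

def batchnorm(y, bn, eps=1e-5):
    """Inference-mode batch norm; bn = (gamma, beta, mean, var), each of shape (C,)."""
    gamma, beta, mean, var = (v[:, None, None] for v in bn)
    return gamma * (y - mean) / np.sqrt(var + eps) + beta

def fuse_conv_bn(w, bn, eps=1e-5):
    """Fold batch norm into the preceding bias-free convolution."""
    gamma, beta, mean, var = bn
    scale = gamma / np.sqrt(var + eps)
    return w * scale[:, None, None, None], beta - mean * scale

def reparameterize(w3, bn3, w1, bn1):
    """Merge the 3x3 branch, 1x1 branch and identity shortcut into one 3x3 conv."""
    k3, b3 = fuse_conv_bn(w3, bn3)
    k1, b1 = fuse_conv_bn(w1, bn1)
    k1 = np.pad(k1, ((0, 0), (0, 0), (1, 1), (1, 1)))  # 1x1 kernel -> centre of a 3x3
    c = w3.shape[0]
    kid = np.zeros_like(k3)
    kid[np.arange(c), np.arange(c), 1, 1] = 1.0  # identity written as a 3x3 kernel
    return k3 + k1 + kid, b3 + b1

def channel_attention(x, fc1, fc2):
    """Squeeze-and-excitation style gating: one weight in (0, 1) per channel."""
    s = x.mean(axis=(1, 2))                   # squeeze: global average pooling
    z = np.maximum(fc1 @ s, 0.0)              # excitation: bottleneck + ReLU
    gate = 1.0 / (1.0 + np.exp(-(fc2 @ z)))   # sigmoid
    return x * gate[:, None, None]

rng = np.random.default_rng(0)
C = 4  # hypothetical channel count
x = rng.normal(size=(C, 8, 8))
w3, w1 = rng.normal(size=(C, C, 3, 3)), rng.normal(size=(C, C, 1, 1))

def rand_bn():
    return (rng.normal(size=C), rng.normal(size=C),
            rng.normal(size=C), rng.uniform(0.5, 2.0, size=C))

bn3, bn1 = rand_bn(), rand_bn()
zeros = np.zeros(C)

# Training-time form: three parallel branches.
train_out = (batchnorm(conv2d(x, w3, zeros), bn3)
             + batchnorm(conv2d(x, w1, zeros), bn1) + x)
# Deployment form: a single 3x3 convolution with merged weights.
k, b = reparameterize(w3, bn3, w1, bn1)
deploy_out = conv2d(x, k, b)
max_diff = np.abs(train_out - deploy_out).max()  # floating-point noise only

# Channel attention applied to the activated block output.
fc1, fc2 = rng.normal(size=(2, C)), rng.normal(size=(C, 2))
y = channel_attention(np.maximum(deploy_out, 0.0), fc1, fc2)
```

Any speed gain at deployment comes from running one convolution instead of three branches plus an addition; accuracy is unaffected because the merged kernel is mathematically equivalent to the multi-branch form.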
Finally, two data augmentation methods, cutout and affine transformation, are used during training to suppress complex background noise in the input and to augment the data, which reduces overfitting and improves the robustness of the algorithm. Comparative experiments on a complex-background gesture dataset show that recognition accuracy reaches 99.9% and recognition speed reaches 200 fps, demonstrating the effectiveness of the proposed algorithm.
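As a rough illustration of the two augmentations named above, the following NumPy sketch implements a basic cutout (masking one random square patch) and a nearest-neighbour affine warp (rotation, scaling, and translation about the image centre). The patch size, angle, scale, and shift values are assumptions chosen for the example; the abstract does not state the settings used in the paper, and the function names are hypothetical.

```python
import numpy as np

def cutout(img, size, rng):
    """Zero out one square patch of side up to `size` centred at a random pixel."""
    h, w = img.shape[:2]
    cy, cx = rng.integers(h), rng.integers(w)
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = img.copy()
    out[y0:y1, x0:x1] = 0  # the patch is clipped at the image border
    return out

def affine(img, angle_deg=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Rotate and scale about the image centre, then translate by (dy, dx)."""
    h, w = img.shape[:2]
    t = np.deg2rad(angle_deg)
    A = scale * np.array([[np.cos(t), -np.sin(t)],
                          [np.sin(t), np.cos(t)]])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, locate its source pixel.
    dst = np.stack([ys - cy - shift[0], xs - cx - shift[1]], axis=-1)
    src = dst @ np.linalg.inv(A).T
    sy = np.rint(src[..., 0] + cy).astype(int)
    sx = np.rint(src[..., 1] + cx).astype(int)
    valid = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros_like(img)  # pixels that map outside the source stay zero
    out[valid] = img[sy[valid], sx[valid]]
    return out

rng = np.random.default_rng(0)
img = rng.uniform(0.1, 1.0, size=(32, 32))  # stand-in for a grayscale gesture image
aug = cutout(affine(img, angle_deg=15, scale=1.1, shift=(2, -3)), size=8, rng=rng)
```

In a training loop, both functions would typically be called with freshly sampled parameters for every image, so the network rarely sees the same background and hand placement twice.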

Authors: YANG Lixia (杨黎霞); XIA Tian (夏天); CHEN Renxiang (陈仁祥); ZHANG Xiao (张晓); QIU Tianran (邱天然) (School of Business Administration, Chongqing University of Science and Technology, Chongqing 401331, China; Chongqing Engineering Laboratory for Transportation Engineering Application Robot, Chongqing Jiaotong University, Chongqing 400074, China)

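Affiliations: School of Business Administration, Chongqing University of Science and Technology; Chongqing Engineering Laboratory for Transportation Engineering Application Robot, Chongqing Jiaotong University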

Source: Journal of Chongqing University of Technology: Natural Science (《重庆理工大学学报（自然科学）》), indexed in CAS and the Peking University Core Journals list, 2023, No. 12, pp. 201-209 (9 pages)

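Funding: National Natural Science Foundation of China (51975079); Science and Technology Research Program of the Chongqing Municipal Education Commission (KJZD-M202200701); Chongqing Graduate Joint Training Base Project (JDLHPYJD2021007); Chongqing Professional Degree Graduate Teaching Case Library (JDALK2022007); Chongqing Graduate Education Curriculum Ideological and Political Demonstration Project (YKCSZ23128); Chongqing Graduate Research and Innovation Project (2023S0072); Research Start-up Project of Chongqing University of Science and Technology (ckre202212030).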

Keywords: gesture recognition; attention mechanism; complex background; structure reparameterization; data augmentation

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
