
结合姿态估计和时序分段网络分析的羽毛球视频动作识别 (Cited by: 2)

Stroke recognition in badminton videos based on pose estimation and temporal segment networks analysis
Abstract
Objective Video-based intelligent action recognition is an active topic in computer vision, and because videos cover many scene types, actions need to be recognized within their specific scene. In badminton, accurately locating and recognizing the strokes in a singles video serves diverse needs: it helps coaches analyze a player's strokes and lets users enjoy highlight collections of each stroke type, and the approach can be transferred to tennis and table tennis, which share similar characteristics. Action recognition over a long video first requires locating the action time domain, and badminton videos are exactly this kind of video in which stroke time domains must be localized. Existing work on temporal action localization assumes clear switching boundaries between adjacent actions whose foreground or background features differ markedly, as in the 50 Salads and Breakfast datasets; in a badminton video, however, there is no obvious foreground or background boundary between adjacent strokes, so such long-video methods are not suitable for localizing badminton strokes. In addition, most existing research on badminton stroke recognition works on static stroke images extracted from badminton videos, and stroke recognition on badminton meta-videos is lacking. We therefore propose a method that temporally localizes and classifies the strokes of the ball-controlling player in extracted badminton video highlights.
Method First, the regional multi-person pose estimation (RMPE) model detects the human poses in a badminton video highlight. The pose of the targeted player is isolated by adding prediction-score and position constraints that suppress irrelevant skeletons, and joint constraints on the detected pose then locate the player's arms. The racket-holding arm is distinguished from the non-holding arm by the difference in their swing amplitudes, the stroke time domain is localized from the variation of the holding arm's swing amplitude, and the located segments are extracted as stroke meta-videos. The swing amplitude of the player's arm in a frame is defined as a linearly weighted sum of the squared moduli of the upper- and lower-arm swing vectors.
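A minimal sketch of how such a per-frame swing amplitude could be computed from 2D pose keypoints is shown below. The keypoint layout, the weights w_upper and w_lower, and the reading of a "swing vector" as the frame-to-frame change of each arm segment are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def swing_amplitude(prev_kpts, cur_kpts, w_upper=1.0, w_lower=1.0):
        # prev_kpts / cur_kpts: {"shoulder": (x, y), "elbow": (x, y), "wrist": (x, y)}
        # for one arm in two consecutive frames (hypothetical keypoint layout).
        def seg(kpts, a, b):
            return np.asarray(kpts[b], dtype=float) - np.asarray(kpts[a], dtype=float)

        # Assumed reading: the swing vector of a segment is its change between frames.
        upper_swing = seg(cur_kpts, "shoulder", "elbow") - seg(prev_kpts, "shoulder", "elbow")
        lower_swing = seg(cur_kpts, "elbow", "wrist") - seg(prev_kpts, "elbow", "wrist")

        # Linearly weighted sum of the squared moduli of the two swing vectors.
        return w_upper * float(upper_swing @ upper_swing) + w_lower * float(lower_swing @ lower_swing)

Under these assumptions, the arm whose amplitude series fluctuates more strongly would be taken as the racket-holding arm, and frames where that amplitude rises above and then falls back below a threshold would bound one stroke's time domain, from which the meta-video is cut.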
Next, a dataset of badminton meta-videos is used to train CBAM-TSN, a temporal segment network (TSN) augmented with a convolutional block attention module (CBAM), to predict the stroke in each meta-video. Because TSN inherits the structure of the two-stream convolutional neural network, a two-stream representation of each meta-video, a spatial stream of RGB frames and a temporal stream of optical-flow frames, must be extracted from the dataset before training. CBAM-TSN predicts four common stroke types: forehand, backhand, overhead, and lob. Finally, overhead strokes are further classified as clears or smashes by morphological processing: a mask of the moving shuttlecock is obtained from the meta-video by image morphology, and because clear meta-videos show a continuous dynamic mask in the background area at the end of the stroke whereas smash meta-videos do not, position-related features of this mask separate the two.
Result In a badminton video highlight, a segmentation is counted as correct when a meta-video produced by the stroke-localization method and a manually extracted meta-video contain the same stroke. Intersection over union (IoU) evaluates localization, while ROC-AUC, recall, and precision evaluate classification. The IoU of stroke localization in badminton video highlights reaches 82.6%. The AUC of every stroke type (forehand, backhand, overhead, and lob) predicted by CBAM-TSN exceeds 0.98, and the micro-AUC, macro-AUC, average recall, and average precision reach 0.9908, 0.9903, 93.5%, and 94.3%, respectively. Compared with three popular action-recognition approaches on badminton stroke recognition, CBAM-TSN obtains the highest precision, micro-AUC, and macro-AUC. The final average recall and precision reach 91.2% and 91.6%, respectively, so the method can effectively locate and classify the main player's strokes in a badminton video highlight.
Conclusion We propose a stroke recognition method for badminton video highlights that combines stroke localization with stroke classification, making the recognition process more intelligent and providing important application value for sports video analysis.
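The abstract names CBAM-TSN, i.e., a TSN with a convolutional block attention module inserted. A minimal PyTorch sketch of the standard CBAM block (channel attention followed by spatial attention, after Woo et al.) is given below; where exactly the paper inserts it inside the TSN backbone is not stated here, so this shows only the attention module itself.

    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )

        def forward(self, x):                      # x: (N, C, H, W)
            avg = self.mlp(x.mean(dim=(2, 3)))     # average-pooled channel descriptor
            mx = self.mlp(x.amax(dim=(2, 3)))      # max-pooled channel descriptor
            w = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
            return x * w

    class SpatialAttention(nn.Module):
        def __init__(self, kernel_size=7):
            super().__init__()
            self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

        def forward(self, x):
            avg = x.mean(dim=1, keepdim=True)      # channel-wise average map
            mx = x.amax(dim=1, keepdim=True)       # channel-wise max map
            w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
            return x * w

    class CBAM(nn.Module):
        def __init__(self, channels, reduction=16, kernel_size=7):
            super().__init__()
            self.channel = ChannelAttention(channels, reduction)
            self.spatial = SpatialAttention(kernel_size)

        def forward(self, x):
            return self.spatial(self.channel(x))

In TSN, each sampled snippet's RGB or optical-flow stack passes through a 2D CNN backbone; a block like this could be applied to the backbone's feature maps before the snippet-level predictions are aggregated by the segmental consensus.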
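The clear-versus-smash decision relies on whether a continuous dynamic (shuttlecock) mask appears in the background area at the end of the stroke. A rough sketch of extracting such a dynamic mask with frame differencing and OpenCV morphology is below; the thresholds, kernel size, and background-region test are illustrative assumptions, not the paper's exact procedure.

    import cv2
    import numpy as np

    def dynamic_mask(prev_gray, cur_gray, thresh=25, kernel_size=5):
        # Frame difference -> binary mask -> morphological open/close to
        # remove noise and connect the moving shuttlecock blob.
        diff = cv2.absdiff(cur_gray, prev_gray)
        _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
        return mask

    def looks_like_clear(end_frames_gray, background_roi, min_hit_ratio=0.6):
        # end_frames_gray: grayscale frames from the end of an overhead-stroke
        # meta-video; background_roi: (x, y, w, h) of the upper background area.
        # Heuristic: a clear keeps a moving blob in the background ROI across
        # most of these frames, while a smash does not.
        x, y, w, h = background_roi
        hits = 0
        for prev, cur in zip(end_frames_gray, end_frames_gray[1:]):
            roi = dynamic_mask(prev, cur)[y:y + h, x:x + w]
            hits += int(cv2.countNonZero(roi) > 0)
        return hits / max(len(end_frames_gray) - 1, 1) >= min_hit_ratio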
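Localization quality is reported as IoU between an automatically segmented meta-video and a manually extracted one containing the same stroke. A small sketch of this temporal IoU is below; representing the two segments as inclusive frame-index intervals is an assumption about how the intervals are stored.

    def temporal_iou(pred, gt):
        # pred, gt: (start_frame, end_frame) of the same stroke, end inclusive.
        inter = max(0, min(pred[1], gt[1]) - max(pred[0], gt[0]) + 1)
        union = (pred[1] - pred[0] + 1) + (gt[1] - gt[0] + 1) - inter
        return inter / union

    # Example: temporal_iou((120, 180), (118, 190)) is about 0.836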
Authors Tao Shu (陶树); Wang Meili (王美丽) (College of Information Engineering, Northwest A&F University, Yangling 712100, China; Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China; Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100, China)
Source Journal of Image and Graphics (《中国图象图形学报》, CSCD, Peking University Core), 2022, Issue 11, pp. 3280-3291 (12 pages)
Funding National Natural Science Foundation of China (61402374); Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs (2018AIOT-09); Shaanxi Key Research and Development Program, General Project in Agriculture and Rural Areas (2019NY-167).
Keywords pose estimation; meta video; badminton stroke localization; convolutional block attention module-temporal segment network (CBAM-TSN); morphological processing; badminton stroke recognition
