Weakly Supervised Temporal Action Detection Based on Graph Convolution and Amplitude Constraint

Abstract: Weakly supervised temporal action detection lacks precise annotations of action start and end times, which leaves the model with little temporal information. To address this, a method is proposed that mines the relationships between video segments, capturing some of the missing temporal information and improving detection. Graph convolution is used to model the task: graph nodes represent the features of video segments and graph edges represent the relationships between segments, so that the detection network considers both the features of each segment and the connections between segments. In addition, an amplitude constraint and a background constraint are used to further model the segment features. Experimental results on public datasets show that the proposed method has some performance advantage over existing methods.
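The graph formulation described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify how edges are built, how the adjacency matrix is normalized, or how many layers are used, so everything below is an assumption made for illustration: edges come from thresholded cosine similarity between segment features, the adjacency matrix is symmetrically normalized, and a single layer computes ReLU(Â X W). The function names (`cosine_adjacency`, `normalize_adjacency`, `graph_conv`) and all dimensions are hypothetical, and the amplitude and background constraints are not modeled here because the abstract gives no formula for them.

```python
import numpy as np


def cosine_adjacency(features, threshold=0.0):
    """Build a segment graph from cosine similarity (assumed construction)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-8)
    sim = unit @ unit.T
    # Keep only edges whose similarity exceeds the threshold.
    adj = np.where(sim > threshold, sim, 0.0)
    # Self-loops so each segment keeps its own feature.
    np.fill_diagonal(adj, 1.0)
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-8))
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def graph_conv(features, adj_norm, weight):
    """One graph convolution layer: ReLU(A_hat @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


# Toy input: 6 video segments with 8-dimensional features (random stand-ins).
rng = np.random.default_rng(0)
num_segments, in_dim, out_dim = 6, 8, 4
segment_features = rng.normal(size=(num_segments, in_dim))
weight = rng.normal(size=(in_dim, out_dim)) * 0.1

adj = cosine_adjacency(segment_features)
adj_norm = normalize_adjacency(adj)
updated = graph_conv(segment_features, adj_norm, weight)
print(updated.shape)  # (6, 4)
```

Each row of `updated` mixes a segment's own feature with those of the segments it is connected to, which is the sense in which the network considers both per-segment features and the relationships between segments.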

Authors: SANG Nong; LI Zhiyuan (School of Artificial Intelligence and Automation, Key Laboratory of Ministry of Education for Image Processing and Intelligent Control, Huazhong University of Science and Technology, Wuhan 430074, China)

Affiliation: School of Artificial Intelligence and Automation, Huazhong University of Science and Technology

Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2023, No. 2, pp. 77-81 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (61871435).

Keywords: graph convolution; weakly supervised; amplitude constraint; background constraint; action detection

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

