
Automatic hazard early warning for electric power scenes with visual relationship detection (cited by: 7)

Visual relationship detection-based emergency early-warning description generation in electric power industry
Abstract

Objective: Using the strong recognition and detection capabilities of deep learning to assist human staff with hazard description and work-safety warnings in electric power scenes is a relatively economical and efficient means of power safety supervision. However, current mainstream early-warning systems built on object detection can only report information about some hazardous targets. They overlook the unary hazard relationships of electrical equipment and the potential binary hazard relationships between paired objects. Unlike previous methods, and in order to extend the recognition ability and functional scope of the hazard-warning module, this paper proposes a method for automatically generating hazard early-warning descriptions in electric power scenes based on visual relationship detection.

Method: For a given image, an object detection module produces the category name and bounding-box position of each object. Semantic, visual, and spatial-location features are then extracted from the image, and the fused feature is fed to a relationship detection module, which outputs unary relationships for single objects and relationship triplets for object pairs. Based on the detected object categories and relationship information, the system predicts hazards and generates a warning description.

Results: The authors collected and annotated images of electric power production work across multiple scenarios and ran extensive ablation experiments. The relationship detector that combines semantic, spatial, and visual features reaches 86.80% on Recall@5 and 93.93% on Recall@10, about 15% higher than the detector that uses visual features alone.

Conclusion: The proposed visual relationship detection network with fused multimodal feature input finds the best-matching predicate relationship well, reduces unreasonable relationship predictions, and shows some zero-shot learning ability. Visualization results indicate that the overall system performs the hazard early-warning description task in electric power scenes well.

Extended abstract

Objective: The past decade has seen steady growth in deep learning research, with extensive work published on improving the learning capabilities of deep neural networks. As a result, a growing number of regulators in the electric power industry use deep learning techniques with powerful recognition and detection capabilities to build surveillance systems, which greatly reduce the risk of major accidents in daily work. However, most current early-warning systems are based on object detection, which can only annotate dangerous targets within the image and ignores significant information about the unary relationships of electrical equipment and the binary relationships between paired objects. This limits their capacity for emergency recognition and forewarning. With powerful object detectors such as the Faster region-based convolutional neural network (Faster R-CNN) and large visual datasets such as Visual Genome, visual relationship detection has attracted much attention in recent years. Building on the basic blocks for single-object detection and understanding, visual relationship detection aims not only to localize a pair of objects accurately but also to determine the predicate between them precisely. As a mid-level learning task, it captures the detailed semantics of visual scenes by explicitly modeling objects along with their relationships with other objects. This bridges the gap between low-level visual tasks and high-level vision-language tasks and helps machines solve more challenging visual tasks such as image captioning, visual question answering, and image generation. The difficulty lies in developing robust algorithms that recognize relationships between paired objects under challenging factors: highly diverse visual features within the same predicate category, incomplete annotation and long-tailed distribution in the dataset, and the optimum predicate matching problem. Although numerous methods have been proposed to build efficient relationship detectors, few of them concentrate on applying detection technologies in practice.

Method: Different from existing methods, our method introduces visual relationship detection into current early-warning systems. Specifically, it not only identifies dangerous objects but also recognizes the potential unary or binary relationships that may cause an accident. We propose a two-stage emergency recognition and forewarning system for the electric power industry that consists of a pre-trained object-detection module and a relationship-detection module. The pipeline has three steps. First, we train an object-detection module based on Faster R-CNN in advance. Given an image, the pre-trained object detector localizes all object bounding boxes and annotates their categories. Then, the relationship-detection module integrates multiple cues (visual appearance, spatial location, and semantic embedding) to compute the predicate confidence of all object pairs and outputs the top-scoring instances as the relationship predictions. Finally, based on the target and relationship information provided by the detectors, our system performs emergency prediction and generates a warning description that may help regulators in the electric power industry make suitable decisions.

Result: We conduct several experiments to demonstrate the efficiency and superiority of our method. First, we collect and build a dataset consisting of a large number of images from multiple scenarios in the electric power industry. Following instructions from experts, we define and label the relationship categories that may pose risks in the images of the dataset. Then, according to the number of objects forming a relationship, we divide the dataset into two parts. Our experiments therefore involve two related tasks for evaluating the proposed method: unary relationship detection and binary relationship detection. For unary relationship detection, we use precision and recall as the evaluation metrics. For binary relationship detection, the evaluation metrics are Recall@5 and Recall@10. As our proposed relationship-detection module uses multiple cues to learn a holistic representation of a relationship instance, we conduct ablation experiments to explore their influence on the final performance. Experimental results show that the detector that takes visual, spatial, and semantic features as input achieves the best performance: 86.80% in Recall@5 and 93.93% in Recall@10.

Conclusion: Extensive experiments show that our proposed method is efficient and effective in detecting defective electrical equipment and dangerous relationships between paired objects. Moreover, we formulate a pre-defined rule to generate the early-warning description from the results of the object and relationship detectors. Together, the proposed methods can help regulators take proper and timely action to avoid harmful accidents in the electric power industry.
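The abstract reports Recall@5 and Recall@10 for binary relationship detection but does not spell out how they are computed. The following is a minimal sketch of the usual Recall@K idea, under stated assumptions: each image's predictions are ranked by confidence, the top K are kept, and recall is the fraction of ground-truth triplets found among them. Matching here is on the (subject, predicate, object) labels only; a full evaluation would typically also require the predicted boxes to overlap the ground-truth boxes. The function name `recall_at_k` and all labels in the example data are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# A relationship triplet: (subject, predicate, object). Labels are hypothetical.
Triplet = Tuple[str, str, str]
ScoredTriplet = Tuple[Triplet, float]


def recall_at_k(
    predictions: List[List[ScoredTriplet]],
    ground_truth: List[List[Triplet]],
    k: int,
) -> float:
    """Fraction of ground-truth triplets found in each image's top-k predictions.

    Simplifying assumption: a prediction matches a ground-truth triplet when the
    three labels are equal; bounding-box overlap is not checked in this sketch.
    """
    hits = 0
    total = 0
    for preds, gts in zip(predictions, ground_truth):
        # Rank this image's predictions by confidence and keep the k best.
        ranked = sorted(preds, key=lambda item: item[1], reverse=True)[:k]
        top_k = {triplet for triplet, _ in ranked}
        unique_gts = set(gts)
        hits += sum(1 for gt in unique_gts if gt in top_k)
        total += len(unique_gts)
    return hits / total if total else 0.0


# Toy data for two images (made up for illustration only).
example_predictions = [
    [
        (("worker", "not_wearing", "helmet"), 0.9),
        (("worker", "near", "transformer"), 0.6),
        (("worker", "holding", "ladder"), 0.3),
    ],
    [
        (("crane", "under", "power_line"), 0.2),
        (("crane", "next_to", "pole"), 0.7),
    ],
]
example_ground_truth = [
    [("worker", "near", "transformer"), ("worker", "holding", "ladder")],
    [("crane", "under", "power_line")],
]

for k in (1, 2, 3):
    print(k, recall_at_k(example_predictions, example_ground_truth, k))
```

On this toy data, recall rises from 0 at K=1 to 2/3 at K=2 and 1 at K=3, which illustrates why a larger K is the more lenient metric and why Recall@10 in the abstract is higher than Recall@5.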
Authors: Gao Ming (高明); Zuo Hongqun (左红群); Bai Fan (柏帆); Tian Qingyang (田清阳); Ge Zhifeng (葛志峰); Dong Xingning (董兴宁); Gan Tian (甘甜). Affiliations: State Grid Ninghai Power Supply Company, Ningbo 315600, China; Ninghai Yancang Mountain Electric Power Construction Company, Ningbo 315600, China; School of Computer Science and Technology, Shandong University, Qingdao 266237, China
Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 7, pp. 1583-1593 (11 pages)
Funding: Science and Technology Project of Ningbo Yongyao Electric Power Investment Group Co., Ltd. (YYKJ202013).
Keywords: emergency early-warning; object detection; visual relationship detection; multimodal feature fusion; multilabel margin loss
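The keyword list names a multilabel margin loss, but the abstract does not describe how the authors apply it. As a hedged illustration only, the sketch below implements the standard multilabel margin (hinge) loss for a single sample: every target class is pushed to score at least `margin` above every non-target class, and the summed violations are divided by the number of classes. The function name, the default margin of 1.0, and the example scores are assumptions, not details taken from the paper.

```python
from typing import Sequence, Set


def multilabel_margin_loss(
    scores: Sequence[float], targets: Set[int], margin: float = 1.0
) -> float:
    """Standard multilabel margin loss for one sample.

    scores  : one score per predicate class
    targets : indices of the classes that are correct for this sample
    The loss sums max(0, margin - (score[target] - score[other])) over every
    (target, non-target) pair and divides by the number of classes.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    total = 0.0
    for j in targets:
        for i in range(len(scores)):
            if i in targets:
                continue  # only compare targets against non-targets
            total += max(0.0, margin - (scores[j] - scores[i]))
    return total / len(scores)


# Four hypothetical predicate classes; classes 0 and 3 are the correct labels.
example_scores = [0.1, 0.2, 0.4, 0.8]
example_targets = {0, 3}
print(multilabel_margin_loss(example_scores, example_targets))  # about 0.85
```

A loss of this shape suits relationship detection because one object pair can validly carry several predicates at once, so the objective ranks all correct predicates above the incorrect ones instead of forcing a single winner.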
