
Pyramid Transformer combined with shallow CNN for substation image tampering detection
(结合金字塔Transformer与浅层CNN的变电站图像篡改检测)
Abstract

Objective: With the widespread application of intelligent power inspection, image information has become particularly important. However, the rapid development of image tampering technology gives malicious actors a new way to harm power systems. Substations are an important component of power systems and are responsible for converting between voltage levels. Ensuring stable voltage output at all times and the reasonable use of substation resources is the basis for the safe and stable operation of the entire power network. If collected substation images are maliciously tampered with, the smart grid system may fail and operators may misjudge the actual state of the substation, which can lead to power system failures and even major accidents such as large-scale power outages, with irreversible losses to national production. Detecting tampered substation images is therefore a key task in ensuring the stability of power systems. The complex backgrounds of tampered images and the varying scales of tampered content cause existing detection models to produce false detections and missed detections. Meanwhile, research on image splicing tampering detection in power scenes is lacking. Accordingly, this study proposes a dual-channel detection model for spliced tampered images in substation scenes.

Method: The model consists of three parts: a Transformer channel with a feature pyramid structure, a shallow convolutional neural network (CNN) channel, and a network head. The input tampered image has size 512×512×3, and the output is the detection and localization result for that image. Both channels use deep learning methods to adaptively extract features, one from the original color image and the other from the residual image. The original color image contains rich color features and content information, while the residual image highlights the edges of the tampered region, which addresses the difficulty of extracting tampering features caused by the diversity of tampered images. The feature pyramid structure Transformer channel is the primary feature extraction channel and consists of a pyramid structure Transformer and a progressive local decoder (PLD). With a global receptive field from the first layer of the model, the Transformer efficiently extracts features and establishes connections between feature points through global attention. The pyramid structure gives the network better generalization and multi-scale feature processing capability. The PLD lets features of different depths and expressiveness guide and fuse with one another, which counters attention scattering and the underestimation of local features and improves detail processing. The shallow CNN channel is the auxiliary detection channel. It extracts the edge features of the tampered region in the residual image, making it easier for the model to locate the tampered region by its overall contour. The backbone of the shallow network is built from residual blocks, and its input is the residual image generated from the tampered image by a high-pass filtering layer. The parallel axial attention block introduces dilated convolutions of different sizes to enlarge the receptive field of the shallow network, and the parallel axial attention mechanism helps the network extract contextual semantic information. The features of the two branches are fused along the channel dimension and passed to the network head; the experiments in this study show that channel-wise concatenation is more effective than element-wise addition. Finally, the network head detects whether tampered regions are present in the image and localizes them.

Result: The model is first trained on pretraining datasets to obtain pretrained weights, and the test results show that it detects various tampering targets well. The model is then fine-tuned from the pretrained weights and compared with four models of the same type on the self-made substation splicing tampered dataset (SSSTD), CASIA (Chinese Academy of Sciences Institute of Automation dataset), and NIST16 (National Institute of Standards and Technology 16). Four evaluation metrics, namely precision, recall, F1, and average precision, are used for quantitative analysis. On SSSTD, the precision, recall, F1, and average precision of the proposed model improve by 0.12%, 2.17%, 1.24%, and 7.71%, respectively, over the second-best model. On CASIA, the proposed model again achieves the best results on all four metrics. On NIST16, all detection models achieve high precision, the proposed model achieves higher recall, and its F1 and average precision are substantially higher than those of the four comparison models. Qualitatively, the proposed model reduces false detections and missed detections while achieving higher localization accuracy, and its overall detection quality is better than that of the other models.

Conclusion: Detecting splicing tampering in substation images is a key task in ensuring the stability of a power system. This study designs a new splicing tampering detection model for complex substation images based on two channels: a feature pyramid structure Transformer and a shallow CNN. The feature pyramid structure Transformer channel obtains rich semantic information and visual features of tampered images through the global interaction mechanism, improving the accuracy and multi-scale processing capability of the detection model. As an auxiliary channel, the shallow CNN focuses on extracting edge features from the residual image, making it easier for the model to locate tampered regions by their overall contour. The model is evaluated on several splicing tampering datasets and achieves the best results on all of them. The visualizations further show that the proposed model gives the best detection results in actual substation scenes. However, this work only investigates image splicing tampering detection, while diverse types of tampering occur in practice. The next step is to investigate the detection of other types of tampered images to broaden the applicability of tampering detection models.
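
Two operations named in the Method description are easy to illustrate in isolation: producing a residual image with a high-pass filter, and fusing two feature branches by channel rather than by element. The following is a minimal pure-Python sketch under stated assumptions. The abstract does not specify the filter, so a 3×3 Laplacian-style kernel stands in for it, and the names `KERNEL`, `high_pass_residual`, `fuse_concat`, and `fuse_add` are hypothetical and do not come from the paper.

```python
# Illustrative sketch only: the kernel and all function names are
# assumptions, not the filter or code used in the paper.

# A 3x3 Laplacian-style high-pass kernel. Its weights sum to zero, so flat
# regions map to 0 and only intensity changes (such as edges) survive.
KERNEL = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]


def high_pass_residual(image):
    """Return the high-pass residual of a 2D grayscale image (list of rows).

    Borders are handled by replicating the nearest edge pixel, so the
    output has the same height and width as the input.
    """
    height, width = len(image), len(image[0])
    residual = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += KERNEL[dy + 1][dx + 1] * image[yy][xx]
            residual[y][x] = total
    return residual


def fuse_concat(features_a, features_b):
    """Channel-wise fusion: stack the channel lists, keeping every map."""
    return features_a + features_b


def fuse_add(features_a, features_b):
    """Element-wise fusion: add matching channels value by value."""
    return [
        [[a + b for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(features_a, features_b)
    ]


# A flat 4x4 background with a brighter 2x2 "spliced" patch in the middle.
image = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
residual = high_pass_residual(image)

# Treat the image and its residual as two single-channel feature maps.
concat = fuse_concat([image], [residual])  # two channels, both preserved
added = fuse_add([image], [residual])      # one channel, values merged
```

Concatenation keeps the color-branch and residual-branch maps as separate channels, so a later layer can weight them independently, whereas addition merges them into one map before any such weighting. The sketch only shows the structural difference; the reported advantage of channel-wise fusion comes from the paper's experiments, not from this example.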
Authors: Xing Jianhao (邢建好); Tian Xiuxia (田秀霞); Han Yi (韩奕) (College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201306, China; College of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai 201306, China)
Source: Journal of Image and Graphics (《中国图象图形学报》), CSCD, PKU Core, 2024, No. 2, pp. 444-456 (13 pages)
Funding: National Natural Science Foundation of China (61772327).
Keywords: substation image; splicing tampering detection; Transformer; convolutional neural network (CNN); dual-channel network; feature pyramid structure; shallow network