Pyramid Transformer combined with shallow CNN for substation image tampering detection


Abstract

Objective With the widespread application of intelligent power inspection, image information has become particularly important. However, the rapid development of image tampering technology gives malicious actors a new way to harm power systems. Substations are an important component of power systems and are responsible for conversion between voltage levels. Ensuring stable voltage output at all times and reasonable use of substation resources is the basis for the safe and stable operation of the entire power network. If collected substation images are maliciously tampered with, a smart grid system may fail and operators may misjudge the actual state of the substation. This can lead to power system failure and even to major accidents such as large-scale power outages, causing irreversible losses to national production. Detecting tampered substation images is therefore a key task in ensuring the stability of power systems. The complex backgrounds of tampered images and the varying scales of tampered content cause existing detection models to produce false detections and missed detections. Meanwhile, research on image splicing tampering detection in power scenes is scarce. Accordingly, this study proposes a dual-channel detection model for spliced tampered images in substation scenes.

Method The model consists of three parts: a Transformer channel with a feature pyramid structure, a shallow convolutional neural network (CNN) channel, and a network head. The input tampered image has size 512×512×3, and the output is the detection and localization result for the tampered image. Both channels use deep learning methods to adaptively extract features, one from the original color image and one from the residual image. The original color image contains rich color features and content information, while the residual image highlights the edges of the tampered region, which addresses the difficulty of extracting tampering features caused by the diversity of tampered images. The feature pyramid structure Transformer channel is the primary feature extraction channel and consists of a pyramid structure Transformer and a progressive local decoder (PLD). Through global attention, the Transformer has a global receptive field from the first layer of the model, so it can efficiently extract features and establish connections between feature points. The pyramid structure gives the network better generalization and multi-scale feature processing capability. The PLD lets features of different depths and expressiveness guide and fuse with one another, which addresses scattered attention and the underestimation of local features and improves detail processing. The shallow CNN channel is an auxiliary detection channel. It uses a shallow network to extract the edge features of the tampered region in the residual image, making it easier for the model to locate the tampered region by its overall contour. Its backbone is built from residual blocks, and its input is the residual image generated from the tampered image by a high-pass filtering layer. A parallel axial attention block introduces dilated convolutions of different sizes to enlarge the receptive field of the shallow network, and the parallel axial attention mechanism helps the network extract contextual semantic information. The features of the two branches are merged along the channel dimension and passed to the network head. The experiments in this study show that merging by channel is more effective than element-wise accumulation. Finally, the network head detects whether tampered regions are present in the image and locates them.

Result The model is first trained on the pretraining datasets to obtain pretraining weights, and the test results show that it detects various tampering targets well. The model is then fine-tuned from the pretraining weights and compared with four models of the same type on the self-made substation splicing tampered dataset (SSSTD), CASIA (Chinese Academy of Sciences Institute of Automation dataset), and NIST16 (National Institute of Standards and Technology 16). Four evaluation metrics, namely precision, recall, F1, and average precision, are used for quantitative analysis. On SSSTD, the precision, recall, F1, and average precision of the proposed model improve by 0.12%, 2.17%, 1.24%, and 7.71%, respectively, over the second-best model. On CASIA, the proposed model again achieves the best results on all four metrics. On NIST16, all detection models achieve high precision, while the proposed model achieves higher recall, and its F1 and average precision are substantially better than those of the four comparison models. Qualitatively, the proposed model reduces false detections and missed detections while achieving higher localization accuracy, and its overall detection effect is better than that of the other models.

Conclusion This study designs a splicing tampering detection model for complex substation images based on two channels: a feature pyramid structure Transformer and a shallow CNN. The feature pyramid structure Transformer channel obtains rich semantic information and visual features of tampered images through the global interaction mechanism, improving the accuracy and multi-scale processing capability of the detection model. As an auxiliary channel, the shallow CNN focuses on extracting edge features from the residual image, making it easier for the model to locate tampered regions by their overall contour. The model is evaluated on several splicing tampering datasets and achieves the best results on all of them. The visualizations further show that the model gives the best detection effect in actual substation scenes. However, this work only investigates image splicing tampering detection, while diverse types of tampering occur in practice. The next step is to investigate detection of other types of tampered images to broaden the applicability of tampering detection models.
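The abstract says that the shallow CNN channel takes a residual image produced by a high-pass filtering layer, but it does not specify the filter. As a minimal sketch, assuming a single 3×3 Laplacian-style kernel and replicate padding (both hypothetical choices made for illustration, not the authors' configuration), the plain-Python function below shows why such a residual suppresses flat image content and responds at the boundary of a spliced patch. The names `HIGH_PASS_KERNEL` and `high_pass_residual` are illustrative only.

```python
# Hypothetical 3x3 high-pass kernel (Laplacian-style). Its coefficients sum to
# zero, so uniform regions map to 0. The paper's actual filter is not given in
# the abstract.
HIGH_PASS_KERNEL = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]


def high_pass_residual(image):
    """Return the high-pass residual of a 2-D grayscale image (list of rows).

    Border pixels are handled by replicating the nearest edge pixel, so the
    output has the same height and width as the input.
    """
    height, width = len(image), len(image[0])
    residual = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp coordinates to replicate the border.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += HIGH_PASS_KERNEL[dy + 1][dx + 1] * image[yy][xx]
            residual[y][x] = acc
    return residual


# Toy input: flat 6x6 background (value 10) with a brighter 2x2 pasted patch
# (value 50) standing in for a spliced region.
image = [[10] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        image[y][x] = 50

residual = high_pass_residual(image)
```

On this toy input the residual is zero wherever the 3×3 neighborhood is uniform and nonzero only on and immediately around the pasted patch, which is the kind of edge cue the auxiliary channel is described as exploiting.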

Authors: Xing Jianhao; Tian Xiuxia; Han Yi

Affiliations: College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201306, China; College of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai 201306, China

Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 2, pp. 444-456 (13 pages)

Funding: National Natural Science Foundation of China (61772327).

Keywords: substation image splicing tampering detection; Transformer; convolutional neural network (CNN); dual-channel network; feature pyramid structure; shallow network

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]

