摘要
The widespread availability of digital multimedia data has led to a new challenge in digital forensics.Traditional source camera identification algorithms usually rely on various traces in the capturing process.However,these traces have become increasingly difficult to extract due to wide availability of various image processing algorithms.Convolutional Neural Networks(CNN)-based algorithms have demonstrated good discriminative capabilities for different brands and even different models of camera devices.However,their performances is not ideal in case of distinguishing between individual devices of the same model,because cameras of the same model typically use the same optical lens,image sensor,and image processing algorithms,that result in minimal overall differences.In this paper,we propose a camera forensics algorithm based on multi-scale feature fusion to address these issues.The proposed algorithm extracts different local features from feature maps of different scales and then fuses them to obtain a comprehensive feature representation.This representation is then fed into a subsequent camera fingerprint classification network.Building upon the Swin-T network,we utilize Transformer Blocks and Graph Convolutional Network(GCN)modules to fuse multi-scale features from different stages of the backbone network.Furthermore,we conduct experiments on established datasets to demonstrate the feasibility and effectiveness of the proposed approach.
基金
This work was funded by the National Natural Science Foundation of China(Grant No.62172132)
Public Welfare Technology Research Project of Zhejiang Province(Grant No.LGF21F020014)
the Opening Project of Key Laboratory of Public Security Information Application Based on Big-Data Architecture,Ministry of Public Security of Zhejiang Police College(Grant No.2021DSJSYS002).