Abstract
Anti-backdoor learning (ABL) detects and suppresses backdoor formation in real time while a model is trained on a poisoned dataset, so that the resulting model is benign. However, ABL cannot effectively separate backdoor samples from benign samples, and its backdoor elimination is inefficient. To address this, an anti-backdoor learning method based on preposed unlearning (ABL-PU) is proposed. In the isolation phase, it adds a purification operation on the training samples so that benign samples are isolated effectively. In the elimination phase, it adopts a paradigm of backdoor unlearning followed by model retraining and introduces an unlearning coefficient to eliminate the backdoor efficiently. On the CIFAR-10 dataset, against the classical backdoor attack BadNets, ABL-PU improves benign accuracy by 1.21 percentage points and reduces the attack success rate by 1.38 percentage points compared with the baseline ABL method.
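The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the purification rule or how the unlearning coefficient enters the loss, so both are assumptions here. The sketch assumes that samples with the lowest per-sample loss are treated as suspected backdoor samples (the heuristic used by ABL), that "purification" keeps only the highest-loss samples as the benign set, and that the unlearning coefficient simply scales a negated loss on the suspected set. The names `split_by_loss`, `unlearning_objective`, `retraining_objective`, `isolate_frac`, and `keep_frac` are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_by_loss(
    losses: Sequence[float],
    isolate_frac: float = 0.01,
    keep_frac: float = 0.5,
) -> Tuple[List[int], List[int]]:
    """Isolation phase (assumed rule): rank samples by per-sample loss.

    The lowest-loss `isolate_frac` of samples are returned as suspected
    backdoor samples; the highest-loss `keep_frac` are returned as the
    purified benign set. Samples in between are left out of both sets.
    """
    n = len(losses)
    order = sorted(range(n), key=lambda i: losses[i])  # ascending loss
    n_backdoor = max(1, int(n * isolate_frac))
    n_benign = min(max(1, int(n * keep_frac)), n - n_backdoor)
    suspected = order[:n_backdoor]
    purified = order[n - n_benign:]
    return suspected, purified


def unlearning_objective(backdoor_losses: Sequence[float], coeff: float) -> float:
    """Elimination step 1 (assumed form): negated mean loss on the suspected
    set, scaled by the unlearning coefficient. Minimizing this value
    corresponds to gradient ascent on the suspected backdoor samples."""
    return -coeff * sum(backdoor_losses) / len(backdoor_losses)


def retraining_objective(benign_losses: Sequence[float]) -> float:
    """Elimination step 2: ordinary mean loss on the purified benign set."""
    return sum(benign_losses) / len(benign_losses)


# Toy per-sample losses; indices 0 and 3 have conspicuously low loss.
toy_losses = [0.02, 1.3, 0.9, 0.01, 1.1, 0.7, 1.5, 0.8, 1.0, 1.2]
suspected_idx, purified_idx = split_by_loss(toy_losses, isolate_frac=0.2, keep_frac=0.5)
```

In a real training loop, the per-sample losses would come from the model being trained, and the two objectives would be minimized one after the other: unlearning on the suspected set first, then retraining on the purified set.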
Authors
WANG Hanxu (王晗旭)
LI Xin (李欣)
XU Wentao (许文韬)
SI Binzhou (斯彬洲)
Affiliations: School of Information Network Security, People's Public Security University of China, Beijing 100038, China; Key Laboratory of Security Prevention Technology and Risk Assessment of the Ministry of Public Security, Beijing 100026, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2024, Issue 19, pp. 259-267 (9 pages)
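Funding
Double First-Class Innovation Research Project for Cyberspace Security Law Enforcement Technology of People's Public Security University of China (2023SYL07).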
Keywords
backdoor attacks
anti-backdoor learning
data purification
preposed unlearning
unlearning coefficient