Abstract
Deep neural networks (DNNs) are widely deployed across many environments and tasks because of their excellent performance, which makes their security increasingly important. Backdoor attacks, a recently emerged type of attack, pose a serious threat to users. During training, the attacker modifies pixels, locally or globally, in a small number of training images using a specific backdoor pattern called a ‘trigger’, and labels those images as the target class, so that a model trained on this data learns the backdoor. At test time, the backdoored model classifies samples carrying the same trigger as the target class with high probability, regardless of their ground truth, while its accuracy on benign samples is unaffected. Users usually have no prior knowledge of the backdoor, so the attack is hard to detect. This paper proposes a backdoor sample self-filtering method, assisted by a pre-trained model, to defend against backdoor attacks. The method has two components: target class detection and backdoor sample self-filtering. In the first component, a pre-trained model extracts a feature representation for each sample, and the k-nearest neighbor (kNN) algorithm is used to detect the target class. In the second component, a partial classification model is first learned from the non-target-class samples, and then "backdoor sample filtering" and "model learning" are alternated over multiple iterations. As a result, the backdoor samples are largely filtered out and a complete benign model is obtained.
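The abstract does not say how kNN is applied to the extracted features, so the following is only a minimal sketch of one plausible reading of the target class detection step, not the authors' procedure. It assumes that backdoor samples keep features close to their original class while carrying the target label, so the target class should contain the largest fraction of samples whose nearest neighbours mostly disagree with their label. The function names, the 0.5 cut-off, and the toy 2-D "features" (standing in for the output of a pre-trained model) are all hypothetical.

```python
import math
from collections import defaultdict


def knn_label_agreement(features, labels, k=5):
    """For each sample, the fraction of its k nearest neighbours sharing its label."""
    scores = []
    for i, fi in enumerate(features):
        # Brute-force Euclidean distances from sample i to every other sample.
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        same = sum(labels[j] == labels[i] for j in neighbours)
        scores.append(same / len(neighbours))
    return scores


def detect_target_class(features, labels, k=5, cutoff=0.5):
    """Return (candidate target class, per-class fraction of suspicious samples).

    Assumed heuristic: a sample is 'suspicious' when fewer than `cutoff` of its
    neighbours share its label; the class with the highest suspicious fraction
    is reported as the candidate target class.
    """
    per_class = defaultdict(list)
    for score, label in zip(knn_label_agreement(features, labels, k), labels):
        per_class[label].append(score < cutoff)
    fraction = {c: sum(flags) / len(flags) for c, flags in per_class.items()}
    return max(fraction, key=fraction.get), fraction


def toy_dataset():
    """Three 3x3 grid clusters plus two mislabelled points standing in for backdoor samples."""
    features, labels = [], []
    for label, origin in enumerate((0.0, 5.0, 10.0)):
        for dx in (0.0, 0.5, 1.0):
            for dy in (0.0, 0.5, 1.0):
                features.append((origin + dx, origin + dy))
                labels.append(label)
    # Hypothetical backdoor samples: located in clusters 1 and 2 but labelled 0.
    features += [(5.25, 5.25), (10.25, 10.25)]
    labels += [0, 0]
    return features, labels


if __name__ == "__main__":
    feats, labs = toy_dataset()
    target, fraction = detect_target_class(feats, labs, k=3)
    print("candidate target class:", target)
    print("suspicious fraction per class:", fraction)
```

In a real pipeline the toy coordinates would be replaced by feature vectors from the pre-trained model, and the brute-force search by an indexed nearest-neighbour search. The choice of k and of the cut-off would need to be tuned, since the source gives no values for either.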
Authors
刘琦
张天行
陆小锋
吴汉舟
毛建华
孙广玲
LIU Qi; ZHANG Tian-xing; LU Xiao-feng; WU Han-zhou; MAO Jian-hua; SUN Guang-ling (School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China)
Source
《计算机技术与发展》
2023, No. 1, pp. 121-129 (9 pages)
Computer Technology and Development
Funding
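Science and Technology Innovation Action Plan Project of the Science and Technology Commission of Shanghai Municipality (21511102605)
National Natural Science Foundation of China (61902235).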