Universal Certified Defense for Black-Box Models Based on Random Smoothing (基于随机平滑的通用黑盒认证防御)

Abstract: In recent years, image classification models based on deep neural networks (DNNs) have been widely deployed in critical fields such as face recognition and autonomous driving, where they deliver excellent performance. However, DNNs are vulnerable to adversarial examples, which cause misclassification, so improving model robustness has become a major research direction. Most existing defenses, especially empirical ones, rely on a white-box assumption: the defender has detailed knowledge of the model, such as its architecture and parameters. Model owners, however, are often unwilling to share this information because of privacy concerns. Even existing black-box defenses cannot defend against perturbations under all norms and therefore lack generality. This paper proposes a universal certified defense method for black-box models. First, it designs a query-based, data-free substitute model generation scheme that requires no prior knowledge of the model's training data or structure; using queries and zeroth-order optimization, it generates a high-quality substitute model, which turns the certified defense setting into a white-box one while keeping the original model private. Second, it proposes random smoothing and noise selection methods based on the white-box substitute model, yielding a universal certified defense that can resist perturbations under any norm. The effectiveness of the substitute model is verified by comparing the original model and the substitute model under white-box certified defense. Experiments on the CIFAR10 dataset show that the proposed scheme performs comparably to white-box certified defense methods. Compared with previous black-box certified defense methods, it provides certified defense for all L_p norms while improving certified accuracy by more than 20%. It also protects the privacy of the original model: relative to the original model, the success rate of membership inference attacks drops by 5.48%.
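To make the smoothing step mentioned in the abstract more concrete, here is a minimal sketch of the textbook Gaussian-noise case with an L2 radius. It is not the paper's noise-selection scheme for arbitrary L_p norms, and it does not cover substitute model generation. The function names, the toy classifier, and the parameter `p_lower` (assumed to be a lower confidence bound on the top-class probability, supplied by the caller) are all hypothetical choices for illustration.

```python
from statistics import NormalDist

import numpy as np


def smoothed_predict(classifier, x, sigma, n_samples, num_classes, rng):
    """Majority vote of a classifier over Gaussian-perturbed copies of x.

    `classifier` is only queried for labels, so it can be a black box
    or a substitute model (assumption: it returns an integer class index).
    """
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    labels = np.array([classifier(x + eps) for eps in noise], dtype=np.int64)
    counts = np.bincount(labels, minlength=num_classes)
    return int(counts.argmax()), counts


def l2_radius(p_lower, sigma):
    """Certified L2 radius sigma * Phi^{-1}(p_lower) for Gaussian smoothing.

    Returns 0.0 when the bound does not exceed 1/2 (no certificate).
    """
    if not 0.0 <= p_lower < 1.0:
        raise ValueError("p_lower must lie in [0, 1)")
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)


def toy_classifier(z):
    """Hypothetical stand-in model: class 1 if the input sums to a positive value."""
    return int(z.sum() > 0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    label, counts = smoothed_predict(toy_classifier, np.ones(4), 0.25, 200, 2, rng)
    print(label, counts, l2_radius(0.9, 0.25))
```

In practice `p_lower` would be estimated from the vote counts with a statistical confidence bound rather than chosen by hand; that estimation step is left out of this sketch.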

Authors: LI Qiao (李瞧); CHEN Jing (陈晶); ZHANG Zi-Jun (张子君); HE Kun (何琨); DU Rui-Ying (杜瑞颖); WANG Xin-Xin (汪欣欣) (Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430079; Institute of Information Technology, Wuhan University, Rizhao, Shandong 276800; Collaborative Innovation Center of Geospatial Technology, Wuhan 430079)

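Affiliations: Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education; School of Cyber Science and Engineering, Wuhan University; Rizhao Institute of Information Technology, Wuhan University; Collaborative Innovation Center of Geospatial Technology, Wuhan University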

Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2024, Issue 3, pp. 690-702 (13 pages)

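Funding: Supported by the National Key Research and Development Program of China (2022YFB3102100), the National Natural Science Foundation of China (62206203, 62076187), the Key Research and Development Program of Hubei Province (2022BAA039), the Key Research and Development Program of Shandong Province (2022CXPT055), and the Wuhan Science and Technology Program (2023010302020707).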

Keywords: deep neural networks; certified defense; random smoothing; black-box models; substitute models

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

