Abstract
As machine learning is increasingly applied to autonomous decision-making across many sectors of society, concerns about potential vulnerabilities in machine learning frameworks are growing. However, owing to their complex implementations, systematic and automated testing of these frameworks is a formidable task. Existing research on testing machine learning frameworks remains immature in generating valid test data: the generated inputs fail legality validation and therefore cannot reach the targeted vulnerabilities. This paper proposes ConFL, a constraint-guided fuzzer for machine learning frameworks. ConFL automatically extracts constraints from framework source code without any prior knowledge. Guided by these constraints, ConFL can generate valid inputs that pass validation and reach deeper code logic in the framework. In addition, this paper designs an operator grouping and scheduling technique to improve fuzzing efficiency. To demonstrate the effectiveness of ConFL, we evaluate it primarily on the TensorFlow framework. The experiments show that, compared with existing state-of-the-art (SOTA) tools, ConFL covers more lines of code and generates more valid test data; on the same version of TensorFlow, ConFL detects more known vulnerabilities. Furthermore, ConFL discovered 84 previously unknown vulnerabilities across different versions of TensorFlow, all of which were fixed by the vendor and assigned CVE IDs, including 3 of critical severity and 13 of high severity. Finally, we also conducted generality tests on PyTorch and PaddlePaddle, in which 7 vulnerabilities have been found so far.
The increasing integration of machine learning (ML) in various sectors for decision-making automation brings to light significant concerns regarding the vulnerabilities in ML frameworks. Such vulnerabilities pose a considerable risk, potentially undermining the integrity and reliability of ML applications in critical areas. Testing these frameworks, however, is notably challenging due to their complex implementations. The intricacy of these systems often masks vulnerabilities, making them difficult to detect with conventional methods. Historically, fuzzing ML frameworks has been met with limited success. The primary challenge in this area has been the effective extraction of input constraints and the generation of valid inputs. Traditional approaches often result in prolonged fuzzing periods, which are not only inefficient but also insufficient in reaching the deeper, more complex execution paths where critical vulnerabilities might lie. In response to these challenges, our paper introduces ConFL (Constraint Fuzzy Lop), a novel, constraint-guided fuzzer designed specifically for ML frameworks. ConFL marks a significant advancement in the field of ML framework testing. Its ability to automatically extract constraints from source code is a groundbreaking feature. This automation is particularly beneficial as it eliminates the need for prior knowledge of the framework's inner workings, thus democratizing the testing process. The constraint-guided approach of ConFL is instrumental in generating valid inputs that are more likely to pass through the initial layers of verification in ML frameworks. This capability enables ConFL to delve deeper into the operator code's pathways, thus uncovering vulnerabilities that would otherwise remain hidden from traditional testing methods. Moreover, ConFL innovates with a unique grouping technique designed to enhance fuzzing efficiency. This technique organizes the testing process in a more structured manner, allowing for a more thorough and systematic exploration of the framework's vulnerabilities. Our evaluation of ConFL's performance, primarily on the TensorFlow framework, has yielded impressive results. ConFL demonstrates a superior capability in covering more code lines and generating a greater number of valid inputs compared to state-of-the-art (SOTA) fuzzers. This increased efficiency is crucial in the practical application of fuzzing in ML frameworks, as it translates to more robust and secure ML applications. In the realm of known vulnerabilities within the TensorFlow framework, ConFL has shown exceptional prowess. It has successfully detected a larger number of vulnerabilities than existing fuzzers. But perhaps more importantly, ConFL has identified 84 previously unknown vulnerabilities across various versions of TensorFlow. These newly discovered vulnerabilities, which include 3 of critical severity and 13 of high severity, have been significant enough to warrant new CVE (Common Vulnerabilities and Exposures) IDs. The versatility of ConFL is further demonstrated by its application to other ML frameworks such as PyTorch and PaddlePaddle. In these frameworks, ConFL has already identified 7 vulnerabilities, indicating its potential as a universal tool for ML framework testing. In conclusion, ConFL represents a significant step forward in securing ML frameworks. Its automated, constraint-guided approach not only makes the fuzzing process more efficient but also more effective in uncovering deep-seated vulnerabilities. As ML continues to permeate various sectors, tools like ConFL will be vital in ensuring the security and reliability of ML-driven systems.
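
To make the constraint-guided idea above concrete, the following is a minimal, hypothetical Python sketch, not ConFL's actual implementation: it hand-codes the kind of constraints that could be mined from one operator's validation logic and uses them to sample inputs that survive validity checks. The operator choice (tf.raw_ops.Conv2D), the sampled shape ranges, and the sample_conv2d_inputs helper are all illustrative assumptions; ConFL itself extracts such constraints automatically from framework source code.

    # Hypothetical sketch of constraint-guided input generation.
    # Assumed constraints for tf.raw_ops.Conv2D: both inputs are rank-4
    # tensors, the channel dimensions match, and the kernel fits the input.
    import random
    import tensorflow as tf

    def sample_conv2d_inputs():
        # Sample batch and spatial dimensions within small illustrative bounds.
        n, h, w = (random.randint(1, 4) for _ in range(3))
        c_in = random.randint(1, 8)           # constraint: in_channels must match
        kh = random.randint(1, min(h, 3))     # constraint: kernel height <= input height
        kw = random.randint(1, min(w, 3))     # constraint: kernel width  <= input width
        c_out = random.randint(1, 8)
        x = tf.random.uniform([n, h, w, c_in])         # rank-4 input, NHWC layout
        f = tf.random.uniform([kh, kw, c_in, c_out])   # rank-4 filter
        return x, f

    x, f = sample_conv2d_inputs()
    # Because the inputs satisfy the extracted constraints, the call passes
    # the operator's validity checks and exercises the convolution kernel itself.
    y = tf.raw_ops.Conv2D(input=x, filter=f,
                          strides=[1, 1, 1, 1], padding="VALID")

By contrast, unconstrained random inputs (for example, a filter whose channel dimension does not match the input's) are rejected by the operator's shape checks before the kernel code is ever reached, which is why constraint-free fuzzers spend most of their budget on shallow validation paths.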
Authors
LIU Zhao, ZOU Quan-Chen, YU Tian, WANG Xuan, ZHANG De-Yue, MENG Guo-Zhu, CHEN Kai (AI Security Lab, Qihoo 360, Beijing 100015; Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100195)
Source
Chinese Journal of Computers (《计算机学报》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
2024, No. 5, pp. 1120-1137 (18 pages)
Funding
Supported by the National Science and Technology Innovation 2030 Major Program "New Generation Artificial Intelligence" (2020AAA0104300).
Keywords
machine learning framework
constraint extraction
operator testing
fuzzing
vulnerability detection