Fuzzing Approach of Clustering Analysis-driven in Seed Scheduling (一种聚类分析驱动种子调度的模糊测试方法)

Abstract: Fuzzing is a widely used automated software testing technique whose primary goal is to explore as much of the code of the program under test as possible, achieving higher coverage and thereby detecting more vulnerabilities or errors. Most existing fuzzing methods schedule seeds according to their historical mutation data. This is simple to implement but ignores how the program space explored by the seeds is distributed, so testing may get stuck probing a single region of the program and waste testing resources. This study proposes Cluzz, a fuzzing method in which seed scheduling is driven by cluster analysis. First, Cluzz uses the distribution of seed execution-path coverage to analyze how seeds differ in feature space, and applies cluster analysis to partition the seeds by how their executions are distributed over the program space. Second, it evaluates seed priority according to the path coverage patterns of the different seed clusters and the clustering results, so that rare code regions are explored and seeds with higher evaluation scores are scheduled first. Third, it allocates energy to seeds according to their evaluation scores, and retains and classifies the interesting inputs produced by mutation to update the seed cluster information. Cluzz then re-evaluates the seeds against the updated clusters to keep the seeds effective throughout testing, exploring more unknown code regions in limited time and improving coverage of the program under test. Finally, Cluzz is implemented on three current mainstream fuzzers and tested extensively on eight popular real-world programs. The results show that Cluzz detects on average 1.7 times as many unique crashes as the ordinary fuzzers and discovers on average 22.15% more new edges than the baseline fuzzers. In addition, compared with existing seed scheduling methods, Cluzz achieves better overall performance than the other baseline fuzzers.
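The pipeline described in the abstract (cluster seeds by coverage, score them, allocate energy) can be illustrated with a minimal sketch. The abstract does not give Cluzz's feature encoding, clustering algorithm, or scoring formula, so everything below is an assumption made for illustration: seeds are represented as sets of covered edge IDs, clustering is plain k-means with farthest-point initialization, and the score (`score_seeds`) simply rewards rarely hit edges and small clusters. All function names are hypothetical and are not taken from Cluzz.

```python
from collections import Counter

def to_vector(edges, num_edges):
    """Binary coverage vector: 1.0 where the seed hit the edge, else 0.0."""
    return [1.0 if e in edges else 0.0 for e in range(num_edges)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Plain k-means; deterministic farthest-point initialization (assumed)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each seed joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def score_seeds(coverage, labels):
    """Hypothetical score: edge rarity divided by the seed's cluster size."""
    edge_hits = Counter(e for edges in coverage for e in edges)
    cluster_size = Counter(labels)
    scores = []
    for edges, label in zip(coverage, labels):
        rarity = sum(1.0 / edge_hits[e] for e in edges)
        scores.append(rarity / cluster_size[label])
    return scores

def assign_energy(scores, budget=100):
    """Split a mutation budget in proportion to score (at least 1 per seed)."""
    total = sum(scores)
    return [max(1, round(budget * s / total)) for s in scores]

# Toy corpus: three seeds exercise one region, one seed reaches a distant region.
coverage = [{0, 1, 2}, {0, 1, 3}, {0, 1, 2, 3}, {7, 8, 9}]
vectors = [to_vector(edges, 10) for edges in coverage]
labels = kmeans(vectors, k=2)
scores = score_seeds(coverage, labels)
energy = assign_energy(scores)
for i, (l, s, e) in enumerate(zip(labels, scores, energy)):
    print(f"seed {i}: cluster={l} score={s:.3f} energy={e}")
```

In this toy corpus the seed that alone reaches edges 7 to 9 ends up in its own cluster and receives most of the budget, which mirrors the stated aim of steering effort away from a single over-explored region. In a real fuzzer the clustering and scoring would be re-run as new interesting inputs are added to the corpus.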

Authors: ZHANG Wen; CHEN Jin-Fu; CAI Sai-Hua; ZHANG Chi; LIU Yi-Song

Affiliations: School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China; Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace (Jiangsu University), Zhenjiang 212013, China

Source: Journal of Software (软件学报), indexed in EI, CSCD, and the Peking University Core Journals list; 2024, Issue 7, pp. 3141-3161 (21 pages)

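Funding: National Natural Science Foundation of China (62172194, 62202206, U1836116); Natural Science Foundation of Jiangsu Province (BK20220515, BK20202001); China Postdoctoral Science Foundation (2023T160275); Postgraduate Research and Practice Innovation Program of Jiangsu Province (KYCX21_3375, SJCX23_2092); Qing Lan Project of Jiangsu Province (2022JSDX001).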

Keywords: fuzzing; software security; cluster analysis; seed scheduling; energy allocation

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

