Class Information Recovery Technology for COTS C++ Binary
Abstract Software written in C++ has long been one of the hardest targets in binary reverse analysis. Binary code no longer retains C++ classes or their inheritance information, and release builds enable compiler optimization by default, which strips away much of what would otherwise remain. This makes reverse analysis of Commercial-Off-The-Shelf (COTS) C++ binaries particularly difficult. Existing work has two shortcomings. First, it does not fully account for compiler optimization, so the recognition rate for classes and their inheritance relationships is low on optimized binaries, and complex relationships such as virtual inheritance are hard to identify. Second, the recognition algorithms are inefficient and cannot cope with reverse analysis of large-scale software. This paper studies the identification of classes and their inheritance relationships in C++ binaries under compiler optimization and makes improvements in three respects. First, it uses inter-procedural static taint analysis to extract object memory layouts from C++ binaries, which effectively resists the effects of compiler optimization (constructor inlining). Second, it introduces four heuristics that recover information lost from optimized C++ binaries. Third, it develops an adaptive CFG (control flow graph) generation algorithm that greatly improves analysis efficiency with minimal loss. On this basis, a prototype system, RECLASSIFY, is implemented; it effectively identifies polymorphic classes and class inheritance relationships (including virtual inheritance) from C++ binaries. Experiments show that under both the MSVC ABI and the Itanium ABI, RECLASSIFY identifies most polymorphic classes and recovers class relationships from optimized binaries in a short time. On a dataset of C++ binaries from 15 real-world programs (compiled with O2 optimization), RECLASSIFY achieves an average recall of 84.36% for polymorphic classes under the MSVC ABI, whereas the previous state-of-the-art solution, OOAnalyzer, achieves an average recall of only 33.76%. In addition, RECLASSIFY improves analysis efficiency by three orders of magnitude compared with OOAnalyzer.
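To make the constructor-inlining problem concrete, the toy Python sketch below shows one cue commonly described in the C++ reverse-engineering literature: when a base constructor is inlined into a derived constructor, the same object slot is written first with the base class's vtable address and then with the derived class's. This is not the RECLASSIFY algorithm. The event format, the function name `infer_parents`, and the vtable labels are all hypothetical, and a real analysis would have to obtain the stores from the binary itself (for example, by tracking the `this` pointer).

```python
def infer_parents(vptr_stores):
    """Infer (child_vtable, parent_vtable) pairs from vtable-pointer stores.

    vptr_stores is a list of (constructor, offset, vtable) tuples in program
    order. This format is an assumption made for illustration only.
    Whenever a constructor overwrites the vtable pointer at an offset, the
    newly stored vtable is treated as a child of the one it replaced.
    """
    last_at = {}   # (constructor, offset) -> vtable most recently stored
    edges = set()
    for ctor, offset, vtable in vptr_stores:
        key = (ctor, offset)
        prev = last_at.get(key)
        if prev is not None and prev != vtable:
            edges.add((vtable, prev))
        last_at[key] = vtable
    return edges


# Hypothetical trace: A <- B <- C, with base constructors inlined.
stores = [
    ("ctor_C", 0, "vt_A"),
    ("ctor_C", 0, "vt_B"),
    ("ctor_C", 0, "vt_C"),
    ("ctor_B", 0, "vt_A"),
    ("ctor_B", 0, "vt_B"),
    ("ctor_A", 0, "vt_A"),
]
print(sorted(infer_parents(stores)))
# prints [('vt_B', 'vt_A'), ('vt_C', 'vt_B')]
```

The sketch assumes its input comes from constructors only. Destructors typically write vtable pointers in the opposite order, so a practical tool would need to tell the two apart before applying a rule like this.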
Authors YANG Jin (杨晋); GONG Xiaorui (龚晓锐); WU Wei (吴炜); ZHANG Bolun (张伯伦) (Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China)
Source Journal of Cyber Security (《信息安全学报》), CSCD, 2024, No. 3, pp. 138-156 (19 pages)
Funding Supported by the Beijing Municipal Science and Technology Program project on the training of, and support platform construction for, specially skilled cyberspace attack-and-defense personnel (No. Z181100002718002).
Keywords binary analysis; class inheritance recovery; static taint analysis; adaptive CFG generation algorithm