Abstract
Most existing code obfuscation schemes are tied to a specific programming language or platform and therefore lack generality, while control flow obfuscation and data obfuscation introduce additional overhead. To address these problems, an identifier obfuscation method based on Low Level Virtual Machine (LLVM) is proposed. The method implements four identifier obfuscation algorithms: the random identifier algorithm, the overload induction algorithm, the abnormal identifier algorithm, and the high-frequency word replacement algorithm. A new hybrid obfuscation algorithm is also designed by combining them. The method first selects, from the intermediate files produced by the compiler front end, the function names that meet the obfuscation criteria; it then processes these names with the chosen obfuscation algorithm; finally, it uses the appropriate compiler back end to translate the obfuscated files into binary files. The method applies to the languages supported by LLVM and does not affect the normal functionality of the program. Across different programming languages, the time overhead is within 20% and the space overhead is almost unchanged. The average obfuscation ratio of programs is 77.5%, and theoretical analysis indicates that the proposed hybrid identifier algorithm provides stronger concealment than the replacement algorithm or the overload algorithm alone. Experimental results show that the proposed method has low performance overhead, strong concealment, and wide applicability.
Authors
TIAN Dajiang (田大江); LI Chengyang (李成扬); HUANG Tianbo (黄天波); WEN Weiping (文伟平)
School of Software and Microelectronics, Peking University, Beijing 102600, China
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals
2022, No. 8, pp. 2540-2547 (8 pages)
Funding
Huawei-Peking University School-Enterprise Cooperation Project (2020001763).
Keywords
software protection
code obfuscation
identifier obfuscation
Low Level Virtual Machine (LLVM)
obfuscation method