Abstract
To address the difficulty of identifying malicious code fragments in the complex, heterogeneous, low-level, and massive evidence data encountered in digital crime investigations, this paper proposes a TensorFlow-based framework for malicious code fragment detection, derived from an analysis of the structure and characteristics of TensorFlow deep learning models. A training algorithm based on back-propagated gradients is proposed from an analysis of the deep learning training process and its mechanisms. To extract features of malicious code fragments from evidence sources spanning different devices and file systems, including storage media and forensic containers such as AFF, a binary feature pre-processing algorithm operating at the storage-medium level is proposed. To support back-propagation training, an algorithm for building a code fragment data set is designed and implemented. Experimental results show that, for automated forensic detection of malicious code fragments across different storage media and evidence containers, the method achieves a comprehensive evaluation index F1 of 0.922, and that, compared with the CloudStrike, Comodo, and FireEye antivirus engines, it has a clear advantage in processing low-level code fragment data.
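As a rough illustration of what storage-level binary pre-processing and the F1 index might look like in practice, the following is a minimal sketch and not the authors' algorithm. The fragment size of 512 bytes, the zero-padding of the final fragment, the scaling of bytes to [0, 1], and all function names are assumptions introduced here for illustration; the abstract does not specify them.

```python
from typing import List

# Assumed fragment length (a common disk sector size); not taken from the paper.
FRAGMENT_SIZE = 512


def split_into_fragments(raw: bytes, size: int = FRAGMENT_SIZE) -> List[bytes]:
    """Cut a raw byte stream into fixed-size fragments, zero-padding the last one."""
    if size <= 0:
        raise ValueError("size must be positive")
    fragments = []
    for start in range(0, len(raw), size):
        chunk = raw[start:start + size]
        # ljust pads a short final chunk with zero bytes up to `size`.
        fragments.append(chunk.ljust(size, b"\x00"))
    return fragments


def to_feature_vector(fragment: bytes) -> List[float]:
    """Scale each byte value (0..255) to a float in [0, 1]."""
    return [b / 255.0 for b in fragment]


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2PR / (P + R), written equivalently as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Because this works on raw bytes rather than parsed files, the same routine could in principle be applied to a disk image or to data read out of a forensic container, independent of the file system; the resulting vectors would then be fed to whatever classifier is trained.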
Authors
李炳龙
佟金龙
张宇
孙怡峰
王清贤
常朝稳
LI Binglong; TONG Jinlong; ZHANG Yu; SUN Yifeng; WANG Qingxian; CHANG Chaowen (College of Cryptographic Engineering, Information Engineering University, Zhengzhou 450001, China)
Source
《网络与信息安全学报》
2021, Issue 4, pp. 154-163 (10 pages)
Chinese Journal of Network and Information Security
Funding
National Natural Science Foundation of China (60903220).