Abstract
The continuing growth in the volume of malware poses a serious threat to cyberspace security. Many samples are obfuscated with analysis-evasion techniques, which makes it difficult for analysis methods based on a single feature to detect or classify malware accurately. Although malware analysis methods that use multiple features already exist, they do not fully exploit the interrelationships between features of different modalities. To address these problems, this paper proposes a malware classification method based on low-rank multimodal fusion. The method first extracts three types of features: the semantics and call relationships of assembly functions, a visualized grayscale image, and the entropy distribution, and feeds each into the sub-model for its modality. The learned modality vectors are then fused by low-rank multimodal fusion. This fusion builds on the use of the outer product to represent the interrelationships between modalities and optimizes it by decomposing the weight matrix used in fusion into low-rank weight factors, which avoids computing a high-dimensional tensor and so reduces computational complexity. Experiments show that the proposed method performs well on malware classification.
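The fusion step described in the abstract can be illustrated with a small sketch. It assumes the commonly used formulation of low-rank multimodal fusion: each modality vector has a constant 1 appended, the full fusion weight is a sum of `rank` outer products of per-modality factors, and the fused output is therefore a sum over rank of element-wise products of per-modality projections. The dimensions, the rank, the random inputs, and all function names below are hypothetical and are not taken from the paper. The sketch only checks that the low-rank path gives the same result as explicitly building the outer-product tensor.

```python
import itertools
import random


def with_bias(z):
    """Append a constant 1 so lower-order interactions are retained."""
    return list(z) + [1.0]


def lowrank_fusion(zs, factors):
    """Fuse modality vectors without building the outer-product tensor.

    zs:      list of modality vectors (plain lists of floats)
    factors: factors[m][r][d][o], one factor per modality m and rank index r
    """
    rank = len(factors[0])
    out_dim = len(factors[0][0][0])
    h = [0.0] * out_dim
    for r in range(rank):
        for o in range(out_dim):
            prod = 1.0
            for z, w in zip(zs, factors):
                z1 = with_bias(z)
                # Project this modality with its own factor, then multiply.
                prod *= sum(z1[d] * w[r][d][o] for d in range(len(z1)))
            h[o] += prod
    return h


def full_tensor_fusion(zs, factors):
    """Reference path: explicit outer product and full weight tensor."""
    z1s = [with_bias(z) for z in zs]
    rank = len(factors[0])
    out_dim = len(factors[0][0][0])
    h = [0.0] * out_dim
    for idx in itertools.product(*(range(len(z1)) for z1 in z1s)):
        # One entry of the outer product of all modality vectors.
        z_entry = 1.0
        for z1, d in zip(z1s, idx):
            z_entry *= z1[d]
        for o in range(out_dim):
            # Matching entry of the weight tensor rebuilt from the factors.
            w_entry = 0.0
            for r in range(rank):
                term = 1.0
                for w, d in zip(factors, idx):
                    term *= w[r][d][o]
                w_entry += term
            h[o] += z_entry * w_entry
    return h


# Hypothetical sizes for three modalities (e.g. assembly, image, entropy).
random.seed(0)
dims, rank, out_dim = [4, 3, 5], 2, 6
zs = [[random.uniform(-1, 1) for _ in range(d)] for d in dims]
factors = [
    [[[random.uniform(-1, 1) for _ in range(out_dim)]
      for _ in range(d + 1)]
     for _ in range(rank)]
    for d in dims
]
h_low = lowrank_fusion(zs, factors)
h_full = full_tensor_fusion(zs, factors)
```

In this sketch the reference path visits every entry of a tensor whose size is the product of the modality dimensions, while the low-rank path only needs one projection per modality and rank index, which is the saving in computational complexity that the abstract refers to.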
Authors
王春东
刘驰
WANG Chundong; LIU Chi (School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China; National Engineering Laboratory for Computer Virus Prevention and Control Technology, Tianjin 300384, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2024, No. 12, pp. 3008-3015 (8 pages)
Journal of Chinese Computer Systems
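Funding
Supported by the Joint Funds of the National Natural Science Foundation of China (U1536122)
Supported by the Major Special Project of the Tianjin Science and Technology Commission (15ZXDSGX00030).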
Keywords
malware classification
low-rank multimodal fusion
assembly code
grayscale image
entropy distribution