Abstract
Existing Android malware analysis techniques based on API features generally use frequent API call sequences or API call subgraphs as features for clustering malware families. This fixes the features of the malware in advance and makes it difficult to classify malware families accurately. An API call graph is a directed graph, so this paper exploits the node dependencies of directed graphs and orders the API call graph by topological sorting. When local features are then extracted at random, each node still depends on the nodes that precede it, which makes the features both random and effective. Combining this with a convolutional neural network, we propose a new technique for Android malware family classification. First, the API call graph of an Android application is extracted through static analysis, and an API database is built to label the APIs. Then the API call graph is sorted and converted into an RGB image, and CallN, the convolutional neural network model proposed in this paper, extracts image features for classification. We select malware from the 20 largest malware families in the Drebin dataset for family classification and obtain a family classification accuracy of 99.93%. The experimental results show that the proposed method classifies malware families effectively.
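The ordering and image-conversion steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how cycles in the call graph are handled, how API labels map to colours, or what image width is used. The helper names (`topological_order`, `id_to_rgb`, `to_image`), the sample graph, the API IDs, and the 24-bit ID-to-RGB split are all hypothetical. Kahn's algorithm is used here as one standard way to compute a topological order.

```python
from collections import deque


def topological_order(graph):
    """Return the nodes of a directed acyclic call graph in topological order.

    `graph` maps a caller name to a list of callee names (Kahn's algorithm).
    Ties are broken alphabetically so the result is deterministic.
    """
    nodes = set(graph)
    for callees in graph.values():
        nodes.update(callees)

    indegree = {node: 0 for node in nodes}
    for callees in graph.values():
        for callee in callees:
            indegree[callee] += 1

    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for callee in sorted(graph.get(node, [])):
            indegree[callee] -= 1
            if indegree[callee] == 0:
                queue.append(callee)

    if len(order) != len(nodes):
        # Assumption: cycles (e.g. recursion) must be removed beforehand.
        raise ValueError("call graph contains a cycle")
    return order


def id_to_rgb(api_id):
    """Split a 24-bit API label into an (R, G, B) triple (hypothetical encoding)."""
    if not 0 <= api_id < (1 << 24):
        raise ValueError("api_id must fit in 24 bits")
    return ((api_id >> 16) & 0xFF, (api_id >> 8) & 0xFF, api_id & 0xFF)


def to_image(order, api_db, width):
    """Lay the ordered APIs out row by row, padding the last row with black."""
    pixels = [id_to_rgb(api_db[name]) for name in order]
    pixels += [(0, 0, 0)] * ((-len(pixels)) % width)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# Hypothetical call graph and API database, for illustration only.
call_graph = {
    "onCreate": ["getDeviceId", "openConnection"],
    "getDeviceId": ["sendTextMessage"],
    "openConnection": ["sendTextMessage"],
}
api_db = {"onCreate": 1, "getDeviceId": 258,
          "openConnection": 65536, "sendTextMessage": 70000}

order = topological_order(call_graph)
image = to_image(order, api_db, width=3)
```

In this sketch every caller appears before its callees in `order`, so any contiguous window of pixels keeps the dependency direction of the original graph. The resulting nested list of RGB triples could then be passed to an image library or a CNN input pipeline; that step is left out because the abstract gives no details of the CallN architecture.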
Authors
Liu Yi (刘易)
Ye Kai (叶凯)
LIU Yi; YE Kai (School of Cyber Science and Engineering, Sichuan University, Chengdu 610065; Zongbei Middle School, Chengdu 610041)
Source
《现代计算机》 (Modern Computer)
2021, Issue 13, pp. 26-31, 37 (7 pages in total)