A Trace-Driven Approach to Mobile App Functionality Classification

Abstract: Program comprehension plays an important role in many scenarios, such as legacy system re-engineering and malware detection. Mobile app functionality classification aims to identify the main functionality of a mobile app by analyzing its runtime behavior. Because runtime environments are dynamic and development frameworks vary, mobile app behavior patterns are usually highly complex, which makes functionality classification challenging. In this paper, we focus on analyzing the execution traces of mobile apps to classify their functionalities automatically. Based on a formal definition of the mobile app functionality classification problem, we propose a systematic solution design framework named RaT (Run-and-Tell) to guide the design of trace-driven mobile app functionality classification solutions. Guided by RaT, we introduce two behavior representation methods, based respectively on statistical characteristics and semantic features extracted from execution traces. We then combine the 2 kinds of behavior representations with 4 neural-network-based mobile app functionality classifiers (i.e., MLP, FCN, ResNet, and LSTM) to construct 8 mobile app functionality classification solutions. Furthermore, using program instrumentation, we collected 876 execution traces from 17 Android apps in 3 Google Play app categories, covering 13 different functionalities, to build the evaluation dataset. Experimental results show that RaT-based solutions using semantics-based behavior representations achieve an average inter-category classification accuracy of 73.2% on the collected dataset, significantly outperforming the baseline methods.
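As a rough illustration of the pipeline the abstract describes, the sketch below turns one execution trace into a simple statistical representation and enumerates the 2 x 4 = 8 representation/classifier pairings. The abstract does not specify which statistics RaT uses, so the normalized event-frequency vector here is an assumption, and the function name, event names, and vocabulary are hypothetical placeholders rather than the authors' implementation.

```python
from collections import Counter
from itertools import product


def statistical_representation(trace, vocabulary):
    """Map a trace (a sequence of event names) to a normalized frequency vector.

    Assumption: relative event frequency stands in for the paper's
    statistical features, which the abstract does not detail.
    """
    counts = Counter(trace)
    total = len(trace) or 1  # avoid division by zero for an empty trace
    return [counts[event] / total for event in vocabulary]


# The two representation kinds and four classifier families named in the abstract.
REPRESENTATIONS = ["statistical", "semantic"]
CLASSIFIERS = ["MLP", "FCN", "ResNet", "LSTM"]

# Every representation paired with every classifier gives the 8 solutions.
SOLUTIONS = list(product(REPRESENTATIONS, CLASSIFIERS))

# Hypothetical trace and event vocabulary, for illustration only.
trace = ["onCreate", "onClick", "httpGet", "onClick"]
vocabulary = ["onClick", "onCreate", "httpGet", "onDestroy"]
vector = statistical_representation(trace, vocabulary)

if __name__ == "__main__":
    print(vector)
    for representation, classifier in SOLUTIONS:
        print(f"{representation} + {classifier}")
```

A fixed vocabulary keeps every trace's vector the same length, which is what a fixed-input classifier such as an MLP would need; a sequence model such as an LSTM would more naturally consume the ordered events themselves.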

Authors: MA Chao; LI Chun-Tung; CAO Jian-Nong; CAI Hua-Qian; WU Li-Bing; SHI Xiao-Chuan (School of Cyber Science and Engineering, Wuhan University, Wuhan 430072; Shenzhen Research Institute, The Hong Kong Polytechnic University, Shenzhen, Guangdong 518057; Department of Computing, The Hong Kong Polytechnic University, Hong Kong 000000; School of Electronics Engineering and Computer Science, Peking University, Beijing 100871)

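Affiliations: School of Cyber Science and Engineering, Wuhan University; Shenzhen Research Institute, The Hong Kong Polytechnic University; Department of Computing, The Hong Kong Polytechnic University; School of Electronics Engineering and Computer Science, Peking University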

Source: Chinese Journal of Computers (计算机学报), 2022, No. 9, pp. 1997-2013 (17 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: Supported by the Key R&D Program of Hubei Province (2021BAA039) and the Key-Area R&D Program of Guangdong Province (2020B010164002).

Keywords: program comprehension; mobile app functionality classification; execution trace; behavior representation; neural network

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
