Abstract
Software developers rely on a large number of application programming interfaces (APIs) when programming, but API documentation may be incomplete or outdated, which makes APIs difficult to understand and use. Existing methods that mine API call patterns through sequential pattern mining (such as UP-Miner) typically draw on a single data source, namely user source code. If the threshold is set high, the mined API call patterns become less complete, and some important patterns may be lost altogether. To address this, the paper proposes a multi-source-driven API call pattern mining method that combines user code with expert example code from Q&A websites (such as Stack Overflow) and uses classification and clustering to mine a smaller set of API call patterns. Comparative experiments against UP-Miner and other tools show that the proposed method achieves a considerable improvement in recall and precision.
Authors
杨超逸
钟林辉
莫俊杰
卢腾骏
高荣锦
阮书鹤
祝艳霞
YANG Chaoyi; ZHONG Linhui; MO Junjie; LU Tengjun; GAO Rongjin; RUAN Shuhe; ZHU Yanxia (School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China; School of VR Modern Industry, Jiangxi University of Finance and Economics, Nanchang 330032, China)
Source
《现代电子技术》
2023, No. 16, pp. 75-80 (6 pages)
Modern Electronics Technique
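Funding
National Natural Science Foundation of China (62062039)
National Natural Science Foundation of China (61966017)
Natural Science Foundation of Jiangxi Province (20212BAB202017)
Natural Science Foundation of Jiangxi Province (20224BAB202013)
Natural Science Foundation of Jiangxi Province (20212BAB202018)
University teaching reform project (JXSDJG2044)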