Abstract
To find pattern instances in a system, most existing design pattern detection methods match the original system directly against the design patterns. This introduces many false positive or false negative instances, which lowers recall and precision. Building on previous work, this study therefore further investigates a design pattern detection method based on similarity scoring and secondary subsystems. Using the relevant information extracted from the system, the system and the design patterns are represented as directed graphs/matrices. The system to be analyzed is divided into several subsystems, which are then further disassembled and reorganized into secondary subsystems whose number of classes equals the number of roles in the pattern to be detected. A similarity scoring algorithm is used to judge whether a secondary subsystem is a pattern instance, and the obtained instances are processed further to obtain the final pattern instances. Experiments on three open-source projects, JHotDraw, JRefactory, and JUnit, show average recall of 96.7%, 91.7%, and 100%, average precision of 94.9%, 91.5%, and 92.5%, and CPU time of 5408 ms, 22280 ms, and 3284 ms, respectively. The results show that the method improves precision and time efficiency while maintaining high recall.
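The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding or the scoring formula, so everything here is an assumption: a 0/1 adjacency matrix stands in for the directed graph/matrix form, every k-class subset of one subsystem stands in for a secondary subsystem, and a simple cell-agreement ratio stands in for the similarity scoring algorithm. The names `submatrix`, `similarity`, and `find_candidates` are hypothetical and not taken from the paper.

```python
from itertools import combinations, permutations

# Assumed toy encoding: matrix[i][j] == 1 means class i refers to class j.

def submatrix(matrix, idx):
    """Restrict the adjacency matrix to the classes in idx, in that order."""
    return [[matrix[i][j] for j in idx] for i in idx]

def similarity(candidate, pattern):
    """Stand-in score: fraction of cells on which the two matrices agree."""
    n = len(pattern)
    agree = sum(candidate[i][j] == pattern[i][j]
                for i in range(n) for j in range(n))
    return agree / (n * n)

def find_candidates(subsystem, pattern, threshold=1.0):
    """Score every k-class secondary subsystem, k = number of pattern roles."""
    k = len(pattern)
    hits = []
    for classes in combinations(range(len(subsystem)), k):
        # Try every assignment of the chosen classes to pattern roles.
        best = max((similarity(submatrix(subsystem, order), pattern), order)
                   for order in permutations(classes))
        if best[0] >= threshold:
            hits.append(best)
    return hits

# Pattern with two roles: role 0 refers to role 1.
pattern = [[0, 1],
           [0, 0]]
# Subsystem with three classes: class 0 refers to class 1, class 2 is isolated.
subsystem = [[0, 1, 0],
             [0, 0, 0],
             [0, 0, 0]]
print(find_candidates(subsystem, pattern))
```

Because each secondary subsystem has exactly as many classes as the pattern has roles, every comparison is between two matrices of the same size, which is what makes a cell-by-cell score possible in this sketch. The further processing of the obtained instances mentioned in the abstract is not modeled here.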
Authors
WANG Lei (王雷)
WANG Wenfa (王文发)
SONG Huina (宋慧娜)
ZHANG Shuai (张帅)
WANG Lei; WANG Wenfa; SONG Huina; ZHANG Shuai (College of Mathematics and Computer Science, Yan’an University, Yan’an, Shaanxi 716000, China; Shaanxi Key Laboratory of Intelligent Processing for Energy Big Data, Yan’an, Shaanxi 716000, China; Joint Laboratory of Yan’an University and Shanghai Pactera (Big Data Application Development Direction), Yan’an, Shaanxi 716000, China)
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2023, No. 1, pp. 210-222 (13 pages)
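Funding
National Natural Science Foundation of China (62041212)
Scientific Research Program of the Shaanxi Provincial Department of Education (21JK0988)
Yan'an University Doctoral Scientific Research Start-up Project (YDBK2019-51)
Open Fund of the Shaanxi Key Laboratory of Intelligent Processing for Energy Big Data, jointly established by province and city (IPBED22)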
Keywords
design pattern recognition
precision
directed graph
secondary subsystem
software reverse engineering