Abstract
Research on clone code is closely related to a wide range of problems in software engineering. Existing studies of clone code stability mainly compare clone code with non-clone code, or compare different clone types with one another; few relate the stability of clone code to the object-oriented classes across which clone sets are distributed. This paper presents an empirical study of clone set stability at the granularity of object-oriented classes. Four research questions about clone set stability are posed. To address them, clone sets are divided into three groups (intra-class, inter-class and hybrid), and their stability is compared using 9 evolution patterns from 4 perspectives. First, clone code is detected in all revisions of each subject system, and every clone fragment is tagged with the class it belongs to. Second, clone sets in adjacent revisions are mapped to each other using a clone fragment mapping method, their evolution patterns are identified and tagged, and the mapping and tagging results are merged into a clone genealogy. Third, the stability of the three groups is calculated from each perspective. Finally, the stability differences among the three groups are compared and analyzed. Experiments on nearly 7,700 revisions of 7 open-source object-oriented systems show that about 60% of intra-class clone sets have a life cycle rate of 50% or more, whereas only about 35% of inter-class clone sets and about 35% of hybrid clone sets do. Intra-class clone sets change least often; inter-class clone sets undergo slightly more merging, branching and late propagation; and hybrid clone sets undergo the most fragment deletions, consistent changes and inconsistent changes. Overall, intra-class clone sets are the most stable, while hybrid clone sets may need to be tracked closely or given priority for refactoring during evolution. The stability analysis method and experimental findings provide a useful reference for clone management activities such as clone tracking, maintenance and refactoring.
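The grouping and the life cycle rate mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the exact definitions, so the following are assumptions: a clone set is intra-class when all of its fragments belong to one class, inter-class when every fragment belongs to a different class, and hybrid otherwise; the life cycle rate is taken to be the number of revisions in which a clone set exists divided by the total number of revisions. All function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def classify_clone_set(fragment_classes: Sequence[str]) -> str:
    """Group a clone set by the classes owning its fragments (assumed definitions)."""
    if len(fragment_classes) < 2:
        raise ValueError("a clone set needs at least two fragments")
    counts = Counter(fragment_classes)
    if len(counts) == 1:
        return "intra-class"   # every fragment lives in the same class
    if all(n == 1 for n in counts.values()):
        return "inter-class"   # every fragment lives in a different class
    return "hybrid"            # some fragments share a class, others do not


def life_cycle_rate(revisions_alive: int, total_revisions: int) -> float:
    """Assumed definition: share of all revisions in which the clone set exists."""
    if total_revisions <= 0:
        raise ValueError("total_revisions must be positive")
    if not 0 <= revisions_alive <= total_revisions:
        raise ValueError("revisions_alive must lie in [0, total_revisions]")
    return revisions_alive / total_revisions


def share_at_or_above(rates: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of clone sets whose life cycle rate reaches the threshold."""
    if not rates:
        return 0.0
    return sum(rate >= threshold for rate in rates) / len(rates)
```

Under these assumptions, the reported figure of "about 60% of intra-class clone sets" would correspond to `share_at_or_above` applied to the life cycle rates of all clone sets that `classify_clone_set` labels as intra-class.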
Authors
ZHANG Jiu-jie (张久杰)
CHEN Chao (陈超)
NIE Hong-xuan (聂宏轩)
XIA Yu-qin (夏玉芹)
ZHANG Li-ping (张丽萍)
MA Zhan-fei (马占飞)
Affiliations: Department of Computer Science & Technology, Baotou Teachers’ College, Baotou, Inner Mongolia 014030, China; School of Computer Science & Technology, Inner Mongolia Normal University, Hohhot 010022, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2021, Issue 5, pp. 75-85 (11 pages)
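Funding
National Natural Science Foundation of China (61762071, 61462071)
Natural Science Foundation of Inner Mongolia Autonomous Region (2014MS0613, 2015MS0606, 2016MS0614, 2019MS06037)
Chongqing Municipal Special Publishing Fund Project, 2020.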
Keywords
Software evolution
Software maintenance
Clone code
Stability
Intra-class clone
Inter-class clone
Hybrid-class clone