Abstract
To reduce code redundancy and improve program structure, a new code clone detection method based on abstract syntax is proposed, common forms of code clones are summarized, and corresponding refactoring techniques are given. The abstract syntax of the source program is represented as a binary tree (BAST). The isomorphism of the BAST subtrees of individual statements is checked statement by statement, and similar statement sequences are identified as clone sequences. 1-tuple clone classes are identified by subtree isomorphism, and 2-tuple and higher-arity clone classes are then identified step by step through a join operation on clone classes. Experiments on several open source projects detected their clone sequences and clone classes. Four common kinds of code clones were derived from the results, characterized respectively by: accessing different properties of objects of the same class at the same program point; differing in some variable names; applying the same operation to different data types; and modifying variables defined outside the clone region. All four kinds of clones were refactored effectively.
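The statement-by-statement isomorphism check described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python's standard `ast` module as a stand-in for the analyzed language, and it assumes a left-child/right-sibling encoding for the binary tree, which the abstract does not specify. The names `BNode`, `to_bast`, `isomorphic`, `clone_sequences`, and the `min_len` parameter are hypothetical. Only the clone-sequence step is covered, not the join operation on clone classes.

```python
import ast
from dataclasses import dataclass
from typing import Optional


@dataclass
class BNode:
    label: str                       # syntactic kind only, e.g. "Assign", "BinOp"
    left: Optional["BNode"] = None   # first child in the original AST
    right: Optional["BNode"] = None  # next sibling in the original AST


def to_bast(nodes: list) -> Optional[BNode]:
    """Encode a list of sibling AST nodes as a binary tree (assumed encoding)."""
    if not nodes:
        return None
    head, rest = nodes[0], nodes[1:]
    children = list(ast.iter_child_nodes(head))
    return BNode(type(head).__name__, to_bast(children), to_bast(rest))


def isomorphic(a: Optional[BNode], b: Optional[BNode]) -> bool:
    """Two binary trees match when shape and node kinds agree.

    Identifier names, attribute names, and literal values are not part of
    the label, so statements differing only in those still match.
    """
    if a is None or b is None:
        return a is b
    return (a.label == b.label
            and isomorphic(a.left, b.left)
            and isomorphic(a.right, b.right))


def clone_sequences(source: str, min_len: int = 2) -> list:
    """Return (i, j, length) for non-overlapping runs of matching statements."""
    trees = [to_bast([stmt]) for stmt in ast.parse(source).body]
    n = len(trees)
    found = []
    for i in range(n):
        for j in range(i + 1, n):
            # Skip pairs that merely continue a run starting one statement earlier.
            if i > 0 and isomorphic(trees[i - 1], trees[j - 1]):
                continue
            k = 0
            # Extend the run while statements match and the two runs stay disjoint.
            while j + k < n and i + k < j and isomorphic(trees[i + k], trees[j + k]):
                k += 1
            if k >= min_len:
                found.append((i, j, k))
    return found


SOURCE = """
w = box.width * scale
h = box.height * scale
area = w + h
x = item.left * ratio
y = item.top * ratio
total = x + y
print(total)
"""

print(clone_sequences(SOURCE))  # [(0, 3, 3)]
```

In this example, statements 0 to 2 and 3 to 5 are reported as a clone sequence even though they use different variable names and access different attributes, which corresponds to the first two clone kinds listed in the abstract. The pairwise scan is quadratic in the number of statements and is meant only to show the idea.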
Source
Journal of Southeast University (Natural Science Edition)
EI
CAS
CSCD
PKU Core
2008, No. 2, pp. 228-232 (5 pages)
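Funding
National Science Fund for Distinguished Young Scholars (60425206)
National Natural Science Foundation of China (60503020)
Natural Science Foundation of Jiangsu Province (BK2006094)
High Technology Research Program of Jiangsu Province (BG2005032)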
Keywords
code clone
clone detection
clone class
software maintenance