Abstract
Self-admitted technical debt (SATD) is technical debt that developers record in a project's source code comments, acknowledging defects or shortcuts they knowingly introduced for short-term benefit; a large amount of SATD hinders software maintenance. In recent years, a growing number of researchers have studied SATD identification and proposed various approaches, such as detection based on natural language processing or text mining. However, most existing approaches rely on predefined word lists or manually extracted features, which is time-consuming, increases computational complexity, and yields unsatisfactory identification results. To address this, an SATD identification approach based on a bidirectional gated recurrent unit (GRU) network and an attention mechanism is proposed. Word vectors are first obtained with the Skip-gram model of Word2vec; a bidirectional GRU network is then built to extract high-level features; finally, the attention mechanism automatically identifies the words that are most influential for SATD classification, thereby capturing the most important semantic information. Experimental results show that the proposed approach performs well in precision, recall and F1-score, identifies SATD effectively, and avoids the complex feature engineering required by traditional approaches.
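The attention step described in the abstract can be illustrated with a minimal sketch. It assumes the bidirectional GRU has already produced one hidden-state vector per word, and it uses a common scoring form (a learned vector `w` applied to `tanh` of each hidden state, followed by a softmax); the abstract does not specify the exact scoring function, so `attention_pool`, `w`, and the toy values below are illustrative assumptions rather than the authors' implementation.

```python
import math


def attention_pool(hidden_states, w):
    """Attention-weighted pooling over per-word hidden states.

    hidden_states: list of equal-length vectors, one per word
                   (assumed to come from a bidirectional GRU).
    w:             scoring vector (learned during training in practice;
                   fixed by hand here for illustration).
    Returns (weights, pooled): the softmax attention weights and the
    weighted sum of the hidden states.
    """
    # Score each word: w . tanh(h_t)  (assumed scoring form)
    scores = [sum(wi * math.tanh(hi) for wi, hi in zip(w, h))
              for h in hidden_states]

    # Numerically stable softmax over the word positions
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Sentence representation: weighted sum of the hidden states
    dim = len(hidden_states[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, hidden_states))
              for d in range(dim)]
    return weights, pooled


# Toy comment of three words, each with a 2-dimensional hidden state.
H = [[0.1, 0.0], [2.0, 1.5], [0.2, -0.1]]
w = [1.0, 1.0]
weights, pooled = attention_pool(H, w)
```

In this toy input the second word receives the largest weight, which is the behaviour the abstract relies on: words that matter most for the SATD decision dominate the pooled representation passed to the classifier.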
Authors
熊罗庚
郑尚
邹海涛
于化龙
高尚
XIONG Luo-geng; ZHENG Shang; ZOU Hai-tao; YU Hua-long; GAO Shang (School of Computer, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212100, China)
Source
《计算机科学》
CSCD
Peking University Core Journals
2022, No. 7, pp. 212-219 (8 pages)
Computer Science
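Funding
General Program of Natural Science Research of Jiangsu Higher Education Institutions (18JBK520011)
Key Research and Development Program (Social Development) of Zhenjiang, Jiangsu Province (SH2019021)
General Program of the Natural Science Foundation of Jiangsu Province (BK20191457).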