Software self-admitted technical debt identification approach based on cross oversampling (cited by: 2)
Abstract: Software self-admitted technical debt (SATD) refers to technical compromises that developers deliberately make to gain short-term benefits for a software project. Prior work has shown that classifiers built from source code comments can be used to identify SATD. However, most classification approaches do not consider the class imbalance that arises because SATD comments are relatively rare among code comments, and the approaches that do consider it have not achieved satisfactory results. This paper proposes a cross oversampling approach: the SATD data are first split into a pool of short texts, and new SATD samples are then generated by randomly selecting short texts from the pool and concatenating them. The authors report that this effectively expands the SATD data and resolves the class imbalance in the text data. In addition, a vector space model is used to construct the feature space, and information gain is used as the feature selection method for training multiple classifiers to identify SATD. Experimental results show that the approach generally outperforms previously proposed methods on three performance measures (precision, recall, and F1-score) and can help project staff identify SATD effectively.
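The oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how comments are cut into short texts or how many are joined per sample, so splitting on punctuation, the `fragments_per_sample` parameter, and the function names `build_fragment_pool` and `cross_oversample` are all assumptions made for illustration.

```python
import random
import re


def build_fragment_pool(satd_comments, delimiters=r"[.,;:!?]+"):
    """Split each SATD comment into short texts and collect them in one pool.

    Splitting on punctuation is an assumption; the abstract only says the
    data are cut into a short text pool.
    """
    pool = []
    for comment in satd_comments:
        for fragment in re.split(delimiters, comment):
            fragment = fragment.strip()
            if fragment:
                pool.append(fragment)
    return pool


def cross_oversample(satd_comments, n_new, fragments_per_sample=2, seed=0):
    """Generate n_new synthetic SATD comments by joining random pool fragments.

    fragments_per_sample is a hypothetical parameter, not taken from the paper.
    """
    rng = random.Random(seed)
    pool = build_fragment_pool(satd_comments)
    k = min(fragments_per_sample, len(pool))
    if k == 0:
        return []
    # Each new sample concatenates k distinct fragments drawn from the pool.
    return [" ".join(rng.sample(pool, k)) for _ in range(n_new)]


# Illustrative minority-class comments (invented for this example).
example_satd = [
    "TODO: fix this hack, it breaks on empty input",
    "FIXME: temporary workaround; remove later",
]
for sample in cross_oversample(example_satd, n_new=3, seed=1):
    print(sample)
```

In a full pipeline, the synthetic comments would be added to the minority (SATD) class of the training set only, before feature construction and feature selection, so that the test data remain untouched.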
Authors: HUANG Cheng (黄城); XU Kehui (徐克辉); ZHENG Shang (郑尚); YU Hualong (于化龙) (School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212100, China; China Ship Research and Development Academy, Beijing 100101, China)
Source: Journal of Jiangsu University of Science and Technology: Natural Science Edition (《江苏科技大学学报(自然科学版)》), CAS, 2020, No. 5, pp. 51-56 (6 pages)
Funding: National Natural Science Foundation of China (61305058, 61572242); Natural Science Foundation of Jiangsu Province (BK20130471); China Postdoctoral Science Foundation Special Funding Program (2015T80481); China Postdoctoral Science Foundation (2013M540404); Jiangsu Province Postdoctoral Foundation (1401037B).
Keywords: self-admitted technical debt; class imbalance; cross oversampling; feature selection