Abstract
As software projects grow in scale and complexity, testing produces a large number of bug reports, many of which are duplicates. Duplicate bug reports reduce the efficiency with which developers fix bugs. Duplicate bug report prediction can effectively prevent duplicate reports from being filed and has become a popular research direction in recent years, but its efficiency and accuracy still need improvement. This study therefore proposes a duplicate bug report prediction method based on semantic extension and continuous queries. The method builds a bug report index thesaurus from a topic model, uses it to semantically extend the query term sequence, and applies a retrieval algorithm based on continuous queries, which narrows the index space while improving prediction accuracy and efficiency. Experimental results show that, compared with traditional duplicate bug report prediction methods, the proposed method reduces the bug report index space by more than 50%, improves prediction performance by up to 33.6%, and shortens retrieval time by 41%–73%.
Authors
张骞月
赵瑞莲
王微微
ZHANG Qian-Yue; ZHAO Rui-Lian; WANG Wei-Wei (College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China)
Source
《计算机系统应用》
2022, Issue 2, pp. 31-39 (9 pages)
Computer Systems & Applications
Funding
National Natural Science Foundation of China (62077003, 61872026).
Keywords
duplicate bug report
semantic extension
continuous query
topic index thesaurus
retrieval algorithm
prediction method