Abstract
Machine learning (ML) techniques and algorithms have been widely and successfully used in many areas, including software engineering tasks. As in other software projects, bugs are common in ML projects and libraries. To better understand the characteristics of bug fixing in ML projects, we conduct an empirical study of 939 bugs from five ML projects, manually examining bug categories, fixing patterns, fixing scale, fixing duration, and types of maintenance. The results show that (1) seven types of bugs are common in ML programs; (2) twelve fixing patterns are typically used to fix bugs in ML programs; (3) 68.80% of the patches belong to the micro-scale-fix and small-scale-fix categories; (4) 66.77% of the bugs in ML programs can be fixed within one month; and (5) 45.90% of the bug fixes are corrective activities from the perspective of software maintenance. Moreover, we conduct a questionnaire survey of developers and users of ML projects to validate the results of our empirical study. The results of the study are largely consistent with the feedback from developers. The findings provide useful guidance and insights to help developers and users effectively detect and fix bugs in ML projects.
Funding
This work was supported partially by the National Natural Science Foundation of China (Grant Nos. 61872312, 61972335, 61472344, 61611540347, 61402396, and 61662021), partially by the Open Funds of the State Key Laboratory for Novel Software Technology of Nanjing University (KFKT2020B15 and KFKT2020B16), partially by the Jiangsu "333" Project, partially by the Six Talent Peaks Project in Jiangsu Province (RJFW-053), partially by the Natural Science Foundation of Jiangsu (BK20181353), partially by the Yangzhou City-Yangzhou University Science and Technology Cooperation Fund Project (YZU201803), by the CERNET Innovation Project (NGII20180607), and partially by the Yangzhou University Top-level Talents Support Program (2019).