One way to make automatic program repair (APR) techniques more practical is to build prediction models that predict whether applying an APR technique to a bug will be effective. Existing prediction models have two limitations. First, they are built with hand-crafted features, which usually fail to capture the semantic characteristics of the program repair task. Second, their performance has been evaluated only on GenProg, a genetic-programming-based APR technique. This paper develops random forest prediction models for SPR, another generate-and-validate APR technique, that distinguish ineffective repair instances from effective ones. Rather than hand-crafted features, we train the prediction models on features learned automatically by a deep belief network (DBN). The empirical results show that, compared to the baseline models, that is, all-effective models, our proposed models improve F1 by at least 9% and AUC (area under the receiver operating characteristic curve) by at least 19%. In addition, the prediction model using learned features outperforms the one using hand-crafted features by at least 11% in terms of F1.
Funding: Supported by the National Natural Science Foundation of China (61603242), the Opening Project of Collaborative Innovation Center for Economics Crime Investigation and Prevention Technology (JXJZXTCX-030), the Scientific Research Fund of Zhaoqing University (201734), and the Innovative Guidance Fund of Zhaoqing City (201704030409).
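The abstract reports gains in F1 and AUC over an all-effective baseline. As a rough illustration of what those two metrics measure, the sketch below computes both for a baseline that labels every repair instance as effective and for a hypothetical scoring model. Everything here is an assumption for illustration: the toy labels and scores are made up, "effective" is treated as the positive class, and the helper names `f1_score` and `auc` are not taken from the paper.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (1), from hard 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the probability that a random positive instance is scored
    above a random negative one; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = effective repair instance, 0 = ineffective.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]

# All-effective baseline: every instance gets the same score and label.
baseline_scores = [1.0] * len(y_true)
baseline_pred = [1] * len(y_true)

# Hypothetical model scores, thresholded at 0.5 for the hard labels.
model_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.5]
model_pred = [1 if s >= 0.5 else 0 for s in model_scores]

print("baseline F1/AUC:", f1_score(y_true, baseline_pred), auc(y_true, baseline_scores))
print("model    F1/AUC:", f1_score(y_true, model_pred), auc(y_true, model_scores))
```

A constant predictor such as the all-effective baseline always has an AUC of 0.5 under this definition, because every positive-negative pair is a tie. Its F1 depends only on the share of effective instances in the data.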