
DP-Share: Privacy-Preserving Software Defect Prediction Model Sharing Through Differential Privacy (cited by: 2)

Abstract: In current software defect prediction (SDP) research, most empirical studies use only the datasets provided by the PROMISE repository, which threatens the external validity of their results. Sharing SDP models, rather than SDP datasets, is a potential way to alleviate this problem, and it can encourage researchers and industrial practitioners to share more models. However, sharing models directly may result in privacy disclosure, for example through a model inversion attack. Differential privacy (DP) mechanisms can prevent this attack when the privacy budget is carefully selected. To the best of our knowledge, we are the first to apply DP to privacy-preserving SDP model sharing, and we propose a novel method, DP-Share. DP-Share first preprocesses the dataset: it over-samples the minority instances (i.e., defective modules) and discretizes continuous features to optimize privacy budget allocation. Then, it uses a novel sampling strategy to create a set of training sets. Finally, it constructs a decision tree on each training set, and these decision trees form a random forest (i.e., the model). This last phase uses the Laplace and exponential mechanisms to satisfy the requirements of DP. In our empirical studies, we choose nine experimental subjects from real software projects, use AUC (area under the ROC curve) as the performance measure, and use holdout as the model validation technique. After privacy and utility analysis, we find that DP-Share achieves better performance than the baseline method DF-Enhance in most cases under the same privacy budget. We also provide guidelines for using the proposed method effectively.
Our work attempts to fill the research gap in differential privacy for SDP, which can encourage researchers and practitioners to share more SDP models and thereby advance the state of the art of SDP.
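
The abstract names the Laplace and exponential mechanisms as the DP building blocks of the tree-construction phase. As a rough illustration of what those two mechanisms do in general, here is a minimal Python sketch. It is not the authors' implementation: the function names, the sensitivity of 1, the example count, and the per-feature utility scores are all hypothetical and chosen only for illustration.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # Laplace mechanism: add noise with scale sensitivity / epsilon.
    # A count query changes by at most 1 when one record changes,
    # so sensitivity = 1 is assumed here.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def exponential_mechanism(candidates, utility, epsilon, rng, sensitivity=1.0):
    # Exponential mechanism: pick a candidate with probability
    # proportional to exp(epsilon * utility / (2 * sensitivity)).
    scores = [epsilon * utility(c) / (2.0 * sensitivity) for c in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)

    # Hypothetical leaf node: 42 defective modules, released with epsilon = 0.5.
    print(noisy_count(42, epsilon=0.5, rng=rng))

    # Hypothetical split-quality scores for three features (assumed values).
    gains = {"loc": 0.30, "cyclomatic": 0.55, "fan_out": 0.10}
    print(exponential_mechanism(list(gains), gains.get, epsilon=1.0, rng=rng))
```

A smaller epsilon gives larger noise and a more uniform choice among candidates, which is the privacy/utility trade-off the abstract refers to when it says the privacy budget must be carefully selected.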
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), indexed in SCIE, EI, CSCD; 2019, Issue 5, pp. 1020-1038 (19 pages)
Funding: Partially supported by the National Natural Science Foundation of China under Grant Nos. 61702041 and 61872263; the Open Project of State Key Laboratory for Novel Software Technology at Nanjing University under Grant No. KFKT2019B14; the Science and Technology Project of Beijing Municipal Education Commission under Grant No. KM201811232016; the Nantong Application Research Plan under Grant No. JC2018134; and the Jiangsu Government Scholarship for Overseas Studies.
