
A Classification Method for Privacy-Preserved Data Using Bayesian Rule (cited by: 1)
Abstract: The retrievable general additive data perturbation (RGADP) algorithm protects database privacy but degrades data mining results. To address this problem, a method is proposed that uses the Bayesian rule to classify perturbed data. The method analyzes the RGADP process, applies the Bayesian rule to estimate the probability distribution of the original data from the perturbed data, reconstructs data from the estimated distribution, and classifies the reconstructed data to improve classification accuracy. Experimental results show that the estimated probability distribution is close to that of the original data. Compared with the perturbed data, the reconstructed data raise classification accuracy by more than 4% on average, bringing it closer to the accuracy obtained on the original data, so the method effectively reduces the impact of the perturbation algorithm on classification. The running time is proportional to the amount of data and to the number of data groups; reconstructing 10,000 records takes less than 200 ms, so the method is also efficient.
Source: Journal of Xi'an Jiaotong University, 2015, No. 4, pp. 46-52 (7 pages). Indexed in EI, CAS, CSCD, and the PKU Core list.
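Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20120201110013); National Natural Science Foundation of China (61172090, 61472316); Fundamental Research Funds for the Central Universities (XKJC2014008); Shaanxi Province Science and Technology Coordination and Innovation Project (2013SZS16).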
Keywords: privacy preservation; data perturbation; Bayesian rule; classification
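
The abstract describes estimating the original distribution from perturbed data with the Bayesian rule and then reconstructing data from that estimate, but it does not give the RGADP-specific derivation. The following is therefore only a minimal sketch under simplified assumptions, not the paper's algorithm: the perturbation is treated as plain additive Gaussian noise with a known standard deviation `sigma`, values are discretized into groups represented by `midpoints`, and the estimate is obtained with a generic iterative Bayes update. The names `estimate_distribution` and `reconstruct` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, sigma):
    """Density of zero-mean Gaussian noise (assumed noise model)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def estimate_distribution(perturbed, midpoints, sigma, iterations=50):
    """Estimate P(original value lies in each group) from perturbed values."""
    k = len(midpoints)
    prior = [1.0 / k] * k  # start from a uniform guess
    # Likelihood of each perturbed value given each group midpoint.
    lik = [[gaussian_pdf(w - m, sigma) for m in midpoints] for w in perturbed]
    for _ in range(iterations):
        totals = [0.0] * k
        for row in lik:
            joint = [l * p for l, p in zip(row, prior)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # value too far from every group to be informative
            for j in range(k):
                totals[j] += joint[j] / norm  # Bayes posterior for this record
        s = sum(totals)
        if s == 0.0:
            break
        prior = [t / s for t in totals]  # averaged posterior becomes new prior
    return prior


def reconstruct(perturbed, midpoints, sigma, prior):
    """Replace each perturbed value by the midpoint of its most probable group."""
    out = []
    for w in perturbed:
        post = [gaussian_pdf(w - m, sigma) * p for m, p in zip(midpoints, prior)]
        out.append(midpoints[post.index(max(post))])
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    midpoints = [10.0, 30.0, 50.0]   # illustrative group midpoints
    true_probs = [0.6, 0.3, 0.1]     # illustrative original distribution
    original = rng.choices(midpoints, weights=true_probs, k=2000)
    sigma = 8.0
    perturbed = [x + rng.gauss(0.0, sigma) for x in original]
    estimate = estimate_distribution(perturbed, midpoints, sigma)
    print([round(p, 3) for p in estimate])
```

Each iteration touches every record once per group, so the cost of this sketch grows with the number of records times the number of groups, which matches the scaling behavior the abstract reports. A classifier would then be trained on the output of `reconstruct` instead of on the perturbed values.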

