
K-means Clustering Algorithm Based on Differential Privacy with Distance and Sum of Squared Errors (cited by: 8)
Abstract: The K-means algorithm is simple, fast, and easy to implement, and it is widely used in data mining, but the clustering process can leak private information. Differential privacy gives a strict definition of privacy protection and allows that protection to be analyzed quantitatively. Existing differentially private K-means algorithms select their initial center points blindly, which lowers the availability of the clustering results. To address this problem, the paper proposes the BDPK-means clustering algorithm, which uses distance and the within-cluster sum of squared errors to select reasonable initial center points for clustering. Theoretical analysis proves that the algorithm satisfies ε-differential privacy. Simulation experiments show that, under the same conditions, BDPK-means improves clustering availability compared with the existing DPK-means algorithm.
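The abstract does not give the BDPK-means procedure itself, so the following is only a minimal sketch of the general setting it describes, under stated assumptions. It shows a generic Laplace-noise K-means update (data assumed scaled to [0, 1]^d, so one record changes a cluster's count and coordinate sums by at most d + 1 in total), a sum-of-squared-errors helper, and a farthest-first seeding rule used purely as a stand-in for distance-based initialization. All function names are hypothetical, the budget split (ε divided evenly across iterations) is an assumption, and seeding from raw points as done here is not itself differentially private.

```python
import random


def laplace(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points, centers):
    """Sum of squared errors: squared distance of each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def farthest_first_init(points, k):
    """Stand-in seeding (assumption, not the paper's rule): repeatedly pick
    the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def dp_kmeans(points, centers, epsilon, iterations, rng):
    """Generic Laplace-noise K-means on data in [0, 1]^d.

    Each iteration spends epsilon / iterations; the (sum, count) query of a
    cluster has L1 sensitivity d + 1, and clusters are disjoint.
    """
    d = len(points[0])
    scale = (d + 1) / (epsilon / iterations)
    for _ in range(iterations):
        k = len(centers)
        sums = [[0.0] * d for _ in range(k)]
        counts = [0] * k
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            counts[j] += 1
            for t in range(d):
                sums[j][t] += p[t]
        updated = []
        for j in range(k):
            noisy_count = counts[j] + laplace(scale, rng)
            if noisy_count < 1.0:
                # Noisy count too small to divide by: keep the previous center.
                updated.append(centers[j])
                continue
            # Clamp to the data domain; this is post-processing of noisy values.
            updated.append([
                min(1.0, max(0.0, (sums[j][t] + laplace(scale, rng)) / noisy_count))
                for t in range(d)
            ])
        centers = updated
    return centers


# Small demonstration on synthetic data: two blobs inside the unit square.
demo_rng = random.Random(42)
data = [(min(1.0, max(0.0, demo_rng.gauss(cx, 0.05))),
         min(1.0, max(0.0, demo_rng.gauss(cy, 0.05))))
        for cx, cy in [(0.2, 0.2), (0.8, 0.8)] for _ in range(500)]
seeds = farthest_first_init(data, 2)
private_centers = dp_kmeans(data, seeds, epsilon=2.0, iterations=3, rng=demo_rng)
print("SSE with seeds:", sse(data, seeds))
print("SSE with noisy centers:", sse(data, private_centers))
```

Comparing the SSE of different seedings under the same ε is one way to see why initialization matters here: the noise added at each iteration is fixed by ε and the dimension, so a poor starting point cannot be repaired by simply running more iterations without thinning the per-iteration budget.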
Authors: HUANG Baohua; CHENG Qi; YUAN Hong; HUANG Pirong (School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China)
Source: Netinfo Security (《信息网络安全》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 10, pp. 34-40 (7 pages)
Funding: National Natural Science Foundation of China [61962005].
Keywords: privacy protection; data mining; differential privacy; K-means clustering; sum of squared errors (SSE)
