Fast Privacy-Preserving Clustering Algorithm (隐私保护的快速聚类算法)

Abstract: To address the inefficiency of clustering algorithms based on secure multi-party computation, a privacy-preserving hierarchical k-means clustering algorithm built on a clustering feature tree is proposed. Working in the semi-honest model, the algorithm keeps the index information of each record and the current level of the clustering feature tree in the third party's memory, which reduces the number of I/O operations and the communication volume. It also overcomes two drawbacks of earlier approaches: difficulty in scaling to multiple data holders, and privacy leakage caused by excessive trust in the third party. Three protocols based on secure multi-party computation (a standardization protocol, a distance computation protocol, and a cluster center computation protocol) protect the data effectively, and combining the strengths of hierarchical and k-means clustering improves both computational precision and scalability. A theoretical analysis establishes the security and efficiency of the algorithm, and experimental results show that it outperforms comparable algorithms in communication cost and computation cost.
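The abstract does not spell out the protocols, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes the clustering feature is the BIRCH-style triple (count, linear sum, sum of squares) and that data is horizontally partitioned, so each party can summarize its own records locally. The names `CF`, `merge`, `centroid`, and `masked_sum` are hypothetical, and `masked_sum` is the generic ring-style secure sum (semi-honest parties, no collusion), used here as a stand-in rather than the paper's actual cluster center protocol.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CF:
    """BIRCH-style clustering feature: record count, linear sum, sum of squares."""
    n: int
    ls: List[float]
    ss: float

    @staticmethod
    def from_points(points: Sequence[Sequence[float]]) -> "CF":
        # Summarize a party's local records without exposing them individually.
        dim = len(points[0])
        ls = [sum(p[d] for p in points) for d in range(dim)]
        ss = sum(x * x for p in points for x in p)
        return CF(len(points), ls, ss)

    def merge(self, other: "CF") -> "CF":
        # Clustering features are additive, so two summaries combine exactly.
        return CF(self.n + other.n,
                  [a + b for a, b in zip(self.ls, other.ls)],
                  self.ss + other.ss)

    def centroid(self) -> List[float]:
        # The cluster center follows from the summary alone.
        return [x / self.n for x in self.ls]


def masked_sum(values: Sequence[int], modulus: int = 2**61 - 1) -> int:
    """Simulate a ring-style secure sum in a single process.

    The initiating party adds a random mask, every party adds its own
    non-negative integer to the running total, and the initiator removes
    the mask at the end. Intermediate totals look random to each party.
    """
    mask = random.randrange(modulus)
    running = mask
    for v in values:  # each iteration stands for one party's turn
        running = (running + v) % modulus
    return (running - mask) % modulus


if __name__ == "__main__":
    cf_a = CF.from_points([[1.0, 2.0], [3.0, 4.0]])  # party A's records
    cf_b = CF.from_points([[5.0, 6.0]])              # party B's records
    print(cf_a.merge(cf_b).centroid())               # [3.0, 4.0]
    print(masked_sum([2, 1]))                        # total record count: 3
```

Because the triple is additive, a coordinator only ever needs per-cluster sums, and those sums are the quantities a secure sum protocol can combine without any party revealing its individual records.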

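Authors: Xue Anrong (薛安荣), Jiang Dongjie (姜冬洁), Ju Shiguang (鞠时光), Chen Weihe (陈伟鹤), Ma Handa (马汉达)

Affiliations: School of Computer Science and Telecommunication Engineering, Jiangsu University; Yanhuang Vocational and Technical College (炎黄职业技术学院)

Source: Systems Engineering and Electronics (《系统工程与电子技术》; indexed in EI, CSCD, and the Peking University core journal list), 2009, No. 10, pp. 2521-2526 (6 pages)

Funding: National Natural Science Foundation of China (60773049, 60603041); Jiangsu University Senior Talent Start-up Fund (09JDG041)

Keywords: privacy-preserving data mining; secure multi-party computation; clustering feature tree; dissimilarity matrix

Classification: TP309 [Automation and Computer Technology: Computer System Architecture]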


