Abstract
To address the risk of privacy leakage during the clustering process of the traditional K-means algorithm and when its clustering results are published, an improved K-means algorithm with differential privacy protection is proposed. Building on the original K-means, a density measure is introduced to increase intra-cluster similarity and ensure that the selected centers lie in relatively dense regions. A distance measure is introduced to reduce inter-cluster similarity and ensure that different cluster centers are well separated. The average maximum inter-cluster similarity is introduced to dynamically determine the optimal number of clusters K and the optimal initial cluster centers. Laplace noise is added to provide differential privacy and protect the data. Experimental results show that the algorithm achieves higher clustering utility and data reliability than traditional algorithms.
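The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes every coordinate is scaled to [0, 1], uses a hypothetical neighbor-count density and a density-times-distance score to pick initial centers, and splits the privacy budget evenly across iterations. The noise step follows the standard approach of perturbing per-cluster sums and counts with Laplace noise. All names (`pick_initial_centers`, `dp_kmeans`, `radius`) are illustrative. The initialization as written reads raw data points and is not itself covered by the privacy budget; the selection of K via the average maximum inter-cluster similarity is not shown.

```python
import math
import random


def laplace(scale, rng):
    """Draw one Laplace(0, scale) sample as a difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pick_initial_centers(points, k, radius):
    """Pick k centers that are dense and mutually far apart (assumed heuristic)."""
    # Density of a point: number of points within `radius` of it (itself included).
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    # First center: the densest point.
    centers = [points[max(range(len(points)), key=lambda i: density[i])]]
    while len(centers) < k:
        # Favor points that are dense and far from every center chosen so far.
        def score(i):
            return density[i] * min(dist(points[i], c) for c in centers)
        centers.append(points[max(range(len(points)), key=score)])
    return centers


def dp_kmeans(points, k, epsilon, iterations=5, radius=0.2, seed=0):
    """Lloyd iterations with Laplace noise on per-cluster sums and counts."""
    rng = random.Random(seed)
    d = len(points[0])
    eps_iter = epsilon / iterations      # even split of the budget (assumption)
    # With coordinates in [0, 1], one point changes the sums by at most d (L1)
    # and one count by 1, so the per-iteration sensitivity is d + 1.
    scale = (d + 1) / eps_iter
    centers = pick_initial_centers(points, k, radius)
    for _ in range(iterations):
        sums = [[0.0] * d for _ in range(k)]
        counts = [0] * k
        for p in points:
            j = min(range(k), key=lambda c: dist(p, centers[c]))
            counts[j] += 1
            for t in range(d):
                sums[j][t] += p[t]
        new_centers = []
        for j in range(k):
            noisy_count = counts[j] + laplace(scale, rng)
            if noisy_count < 1.0:
                # Too little (noisy) mass to form a mean: keep the old center.
                new_centers.append(centers[j])
                continue
            noisy_mean = [(sums[j][t] + laplace(scale, rng)) / noisy_count
                          for t in range(d)]
            # Clamp back into the assumed [0, 1] data range.
            new_centers.append([min(1.0, max(0.0, v)) for v in noisy_mean])
        centers = new_centers
    return centers
```

A smaller `epsilon` gives stronger privacy but noisier centers; with a very large `epsilon` the result approaches ordinary K-means started from the density-based centers.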
Authors
王彩鑫
王丽丽
杨洪勇
Wang Caixin; Wang Lili; Yang Hongyong (School of Information and Electrical Engineering, Ludong University, Yantai 264025, China)
Source
《兵工自动化》
2023, No. 12, pp. 38-45 (8 pages)
Ordnance Industry Automation
Funding
National Natural Science Foundation of China (61673200)
Major Basic Research Project of Shandong Province (ZR2018ZC0438).