Abstract
Privacy-preserving data mining has developed rapidly in recent years, but there has been little research on privacy preservation in data visualization, despite its wide range of applications. Differential privacy is a promising new privacy-preserving paradigm, yet we are not aware of any existing multidimensional data visualization method that operates under differential privacy. In this paper, we study how to satisfy differential privacy in the process of data visualization. The existing DP k-means algorithm is mainly of theoretical interest for data aggregation because it does not work at the large k that aggregation requires. Motivated by this, we propose ε-Differential Privacy Equipartition k-means (DPE k-means), a method that supports large k. It eliminates most data overlapping in the visualization and greatly improves the quality of the visualized image at a given privacy level. In simulation experiments we compute several measures of data aggregation quality; the results show that, at the same ε, DPE k-means achieves much higher aggregation quality than the existing DP k-means algorithm.
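To illustrate why a Laplace-noised k-means update struggles at large k, the following is a minimal sketch of one iteration of a common textbook formulation of differentially private k-means. It is not the paper's DP k-means baseline or its DPE k-means method, whose details the abstract does not give. The names `laplace` and `dp_kmeans_step` are hypothetical, and the sketch assumes points lie in [0, 1]^d and that neighboring datasets differ by adding or removing one point, so that the per-iteration L1 sensitivity of the counts and coordinate sums is d + 1.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_kmeans_step(points, centers, epsilon, rng):
    """One Laplace-noised k-means update (illustrative sketch only).

    Assumes every point lies in [0, 1]^d; `epsilon` is the privacy
    budget spent on this single iteration.
    """
    k, d = len(centers), len(centers[0])
    # Assumption: adding or removing one point changes one count by 1
    # and one sum vector by at most d in L1 norm, so sensitivity is d + 1.
    scale = (d + 1) / epsilon

    counts = [0] * k
    sums = [[0.0] * d for _ in range(k)]
    for p in points:
        # Assign the point to its nearest center (squared Euclidean distance).
        j = min(range(k),
                key=lambda c: sum((p[i] - centers[c][i]) ** 2 for i in range(d)))
        counts[j] += 1
        for i in range(d):
            sums[j][i] += p[i]

    new_centers = []
    for j in range(k):
        noisy_count = counts[j] + laplace(scale, rng)
        if noisy_count < 1.0:
            # Noisy count too small to divide by: keep the old center.
            new_centers.append(list(centers[j]))
            continue
        center = [(sums[j][i] + laplace(scale, rng)) / noisy_count
                  for i in range(d)]
        # Clamp back into the assumed data domain [0, 1]^d.
        new_centers.append([min(1.0, max(0.0, v)) for v in center])
    return new_centers
```

The noise scale here depends only on d and ε, not on cluster size. With n points split into k clusters, each cluster holds about n/k points, so the same noise is a larger fraction of each count and sum as k grows.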
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal (北大核心)
2013, Issue 7, pp. 1637-1640 (4 pages)
Journal of Chinese Computer Systems
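Funding
Supported by the National Natural Science Foundation of China (61070033)
Supported by the Natural Science Foundation of Guangdong Province (9251009001000005)
Supported by the Guangdong Provincial Science and Technology Planning Project (2010B050400011)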
Keywords
differential privacy preservation
k-means
data aggregation
data visualization
parallel coordinates