Abstract
Grouping-based differentially private histogram release has attracted considerable research attention in recent years. The accuracy of a released histogram is directly constrained by the trade-off between the approximation error introduced by replacing bucket counts with group means and the Laplace error introduced by Laplace noise. Most existing grouping-based methods cannot balance these two errors effectively. This paper proposes DiffHR (differentially private histogram release), a histogram release method that satisfies differential privacy. Observing that sorting the sequence of bucket counts helps improve release accuracy, DiffHR combines the Metropolis-Hastings technique from Markov chain Monte Carlo (MCMC) with the exponential mechanism to obtain an efficient sorting method, which repeatedly swaps two randomly selected buckets to gradually approach the correct order. On the histogram sorted by this sampling procedure, DiffHR then applies a utility-driven adaptive greedy clustering method based on a lower bound for lazy grouping; the clustering method runs in O(n) time and balances the approximation error against the Laplace error. DiffHR is compared with existing methods such as GS and AHP on real datasets, and the experimental results show that it achieves higher release accuracy than these competitors.
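The trade-off between approximation error and Laplace error described in the abstract can be illustrated with a minimal sketch. This is not the DiffHR algorithm: the function names `group_error`, `greedy_groups`, and `release` are hypothetical, the merge rule is a simple stand-in for the paper's lazy-grouping lower bound, and the sketch assumes a count histogram with sensitivity 1 that is already sorted. Note also that the grouping decisions here read the true counts, so only the final noisy release step reflects the Laplace mechanism; the sketch as a whole should not be taken as differentially private.

```python
import random


def group_error(counts, epsilon):
    """Estimated squared error of releasing one group by its noisy mean.

    Approximation error: sum of squared deviations from the group mean.
    Laplace error: Lap(1/epsilon) noise on the group sum, divided by the
    group size k, gives total expected squared error 2 / (epsilon^2 * k).
    """
    k = len(counts)
    mean = sum(counts) / k
    approx = sum((c - mean) ** 2 for c in counts)
    laplace = 2.0 / (epsilon ** 2 * k)
    return approx + laplace


def greedy_groups(sorted_counts, epsilon):
    """Single left-to-right pass: extend the current group while merging
    the next bucket does not increase the estimated error.

    Assumption: this illustrative rule recomputes errors from scratch,
    so it does not attain the O(n) bound stated in the abstract.
    """
    groups = [[sorted_counts[0]]]
    for c in sorted_counts[1:]:
        merged = groups[-1] + [c]
        split_cost = group_error(groups[-1], epsilon) + group_error([c], epsilon)
        if group_error(merged, epsilon) <= split_cost:
            groups[-1] = merged
        else:
            groups.append([c])
    return groups


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two independent
    Exp(1) variables is Laplace(0, 1)."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def release(groups, epsilon, rng):
    """Replace every bucket by the noisy mean of its group."""
    out = []
    for g in groups:
        noisy_mean = (sum(g) + laplace_noise(1.0 / epsilon, rng)) / len(g)
        out.extend([noisy_mean] * len(g))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    groups = greedy_groups([1, 1, 1, 100, 100], epsilon=1.0)
    print(groups)
    print(release(groups, 1.0, rng))
```

Larger groups shrink the Laplace term, which falls as 1/k, but raise the approximation term whenever the merged counts differ. This is why sorting the buckets first helps: buckets with similar counts become adjacent and can be grouped at little approximation cost.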
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
PKU Core Journals
2016, Issue 5, pp. 1106-1117 (12 pages)
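Funding
National Natural Science Foundation of China (61502146, 61379050, U1404605, 61202285)
National High Technology Research and Development Program of China (863 Program) (2013AA013204)
Basic and Frontier Technology Research Project of the Science and Technology Department of Henan Province (152300410091)
Key Scientific Research Project of Higher Education Institutions of the Education Department of Henan Province (16A520002)
Major Research Project of Henan University of Economics and Law (201426)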
Keywords
differential privacy
histogram release
grouping
Laplace error
approximation error