Abstract
Existing clustering-based algorithms for histogram publishing under differential privacy do not distinguish between histogram count sets with small variance and those with large variance, which leads to unnecessarily high algorithmic complexity when processing low-variance count sets. For histogram count sets with small variance, this paper proposes a segmentation strategy based on the difference (D-value) between adjacent bin counts. First, segmentation boundaries are determined by computing the differences between the counts of adjacent unit bins. Then, the feasibility of each split is judged by the change in the sum of the reconstruction error and the noise error. Finally, theoretical analysis and experimental simulation show that the algorithm greatly improves efficiency while preserving the accuracy of the published data, which verifies its effectiveness.
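The two steps named in the abstract (boundaries from adjacent-bin differences, then a feasibility check on the combined reconstruction and noise error) can be illustrated with a minimal sketch. This is not the paper's algorithm: the greedy largest-gap-first order, the error formulas, and the names `segment`, `publish`, `sse`, and `noise_error` are all assumptions made for illustration. The noise term assumes each group's sum receives Laplace noise of scale 1/epsilon and is then averaged over the group, and the sketch ignores any privacy budget that choosing the boundaries themselves would consume.

```python
import numpy as np


def sse(group):
    """Reconstruction error: squared deviation of each count from the group mean."""
    return float(np.sum((group - group.mean()) ** 2))


def noise_error(size, epsilon):
    """Expected squared noise over a group of `size` bins (assumed model).

    Laplace(1/epsilon) has variance 2/epsilon**2; averaging the noisy sum over
    `size` bins gives 2/(epsilon*size)**2 per bin, i.e. 2/(epsilon**2*size) in total.
    """
    return 2.0 / (epsilon ** 2 * size)


def total_error(counts, boundaries, epsilon):
    """Sum of reconstruction and noise error over all groups."""
    edges = [0] + sorted(boundaries) + [len(counts)]
    return sum(
        sse(counts[a:b]) + noise_error(b - a, epsilon)
        for a, b in zip(edges, edges[1:])
    )


def segment(counts, epsilon):
    """Greedy split (hypothetical): try the largest adjacent differences first
    and keep a boundary only if it lowers the total error."""
    counts = np.asarray(counts, dtype=float)
    diffs = np.abs(np.diff(counts))
    # Boundary i + 1 separates bin i from bin i + 1.
    candidates = np.argsort(-diffs, kind="stable") + 1
    boundaries = []
    best = total_error(counts, boundaries, epsilon)
    for b in candidates:
        trial = total_error(counts, boundaries + [int(b)], epsilon)
        if trial < best:
            boundaries.append(int(b))
            best = trial
    return sorted(boundaries)


def publish(counts, boundaries, epsilon, rng):
    """Replace each group by its noisy mean (Laplace noise on the group sum)."""
    counts = np.asarray(counts, dtype=float)
    edges = [0] + sorted(boundaries) + [len(counts)]
    out = np.empty_like(counts)
    for a, b in zip(edges, edges[1:]):
        noisy_sum = counts[a:b].sum() + rng.laplace(0.0, 1.0 / epsilon)
        out[a:b] = noisy_sum / (b - a)
    return out


if __name__ == "__main__":
    hist = [10, 10, 11, 10, 50, 51, 50, 49]
    cuts = segment(hist, epsilon=1.0)
    print(cuts)
    print(publish(hist, cuts, 1.0, np.random.default_rng(0)))
```

In this toy histogram only the large jump between the fourth and fifth bins is worth a boundary; the remaining differences of 1 are rejected because the extra noise from smaller groups outweighs the reduction in reconstruction error.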
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2014, No. 12, pp. 3700-3703, 3710 (5 pages in total)
Application Research of Computers
Funding
Fundamental Research Funds for the Central Universities (JUSRP111A49)
Keywords
differential privacy
histogram publishing
clustering processing
algorithm complexity
D-value
segmentation boundary