摘要
基于分类树的差分隐私保护方法有效地对静态集值型数据进行了保护,但对于动态集值型数据却没有相应的保护方法,因此提出一种基于分类树的差分隐私保护下的动态集值型数据发布的算法。该算法首先根据数据集中项的全集构造关系矩阵,挑选关系最紧密的项集构造分类树;然后设定一个边界值来限制数据的增量更新,并将新增的记录添加到分类树的根节点中,按照初始分类树的分配法迭代分配每个记录;最后根据拉普拉斯机制向叶子节点中加入噪音,保证整个算法满足差分隐私的要求。相对已有算法,所提算法优化了分类树,使所发布数据建立的分类树模型有少量的叶子节点产生,减少了噪音的添加。实验用两组真实的数据集验证了所提算法的有效性和相对于其他算法的优越性。
Differential privacy protection method based on taxonomy tree is effective for static set-valued of data protec- tion, but it doesn't have corresponding protection method for dynamic set-valued data. So, this paper presented a classi- fication tree based analysis of differential protection of privacy under dynamic set-valued data released. Firstly, according to the complete set structure relation matrix of the data set, this algorithm chooses the most closely related item set, and constructs the taxonomy tree. And then a boundary value is set to limit the incremental data update, and the new record is added to the root node of the taxonomy tree,in accordance with the initial taxonomy tree distribution method itera- tively assigns each record. Finally, according to the Laplace mechanism, noise is added into leaf node to ensure that the algorithm satisfies differential privacy requirements. Compared with the existing algorithms, this algorithm optimizes the taxonomy tree,so that the release of data taxonomy tree model is established with small leaf nodes, reducing the noise added. Experiment with two real datasets shows that the algorithm is effective and performs better than existing algo- rithms.
出处
《计算机科学》
CSCD
北大核心
2017年第5期120-124,165,共6页
Computer Science
关键词
隐私保护
分类树
动态集值型数据
增量更新
Privacy protection,Taxonomy tree,Dynamic set-valued data, Incremental update