Abstract
With the rapid development of big data technology and the continuing growth in data volume, companies and institutions collect large amounts of data. Aggregating and publishing data held by different companies or institutions helps to provide better services and support decision-making. However, each party's data may contain private information of differing sensitivity, so personalized privacy protection requirements must be met when the parties' data are aggregated and published. To address multi-party data publication under personalized privacy requirements, this paper proposes a Multi-party Vertically partitioned Data Synthesis mechanism with Personalized Differential Privacy (PDP-MVDS). The mechanism first generates low-dimensional marginal distributions to reduce the dimensionality of the high-dimensional data, then uses these marginals to update a randomly initialized dataset, and finally publishes a synthetic dataset whose distribution approximates that of the real aggregated dataset of all parties. Personalized differential privacy is achieved by dividing the privacy budget; a secure scalar product protocol and threshold Paillier encryption keep each party's data private during aggregation; and a distributed Laplace perturbation mechanism protects the privacy of the marginal distributions aggregated across parties. Rigorous theoretical analysis proves that PDP-MVDS ensures the security of each participant's data and of the published dataset. Experiments on public datasets show that PDP-MVDS generates multi-party synthetic datasets with high utility at low overhead.
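The distributed Laplace perturbation named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It relies on a standard fact: a Laplace(0, b) variable equals the sum over n parties of the difference of two independent Gamma(1/n, b) draws, so no single party knows the total noise. The names `noise_share` and `noisy_marginal` are hypothetical, the parties are assumed to hold additive shares of one marginal count vector, a single `epsilon` is used rather than personalized budgets, and the summation is done in the clear, whereas the paper performs aggregation under a secure scalar product protocol and threshold Paillier encryption.

```python
import numpy as np

def noise_share(n_parties, scale, size, rng):
    """One party's noise share: difference of two Gamma(1/n, scale) draws.

    Summing the shares of all n parties yields Laplace(0, scale) noise.
    """
    shape = 1.0 / n_parties
    return rng.gamma(shape, scale, size) - rng.gamma(shape, scale, size)

def noisy_marginal(party_counts, epsilon, sensitivity=1.0, seed=0):
    """Sum per-party count shares, each perturbed locally before aggregation.

    Assumption: party_counts is a list of equal-length arrays that add up
    to the true marginal counts; sensitivity is supplied by the caller.
    """
    rng = np.random.default_rng(seed)
    n = len(party_counts)
    scale = sensitivity / epsilon  # Laplace scale b = sensitivity / epsilon
    total = np.zeros_like(party_counts[0], dtype=float)
    for counts in party_counts:
        total += counts + noise_share(n, scale, counts.shape, rng)
    return total

if __name__ == "__main__":
    shares = [np.array([3, 1, 0, 2]), np.array([1, 0, 4, 0])]
    print(noisy_marginal(shares, epsilon=1.0))
```

In this sketch a smaller `epsilon` gives a larger noise scale and therefore stronger protection for the released marginal, at the cost of accuracy.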
Authors
ZHU Youwen (朱友文); WANG Ke (王珂); ZHOU Yuqian (周玉倩)
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Source
Journal of Electronics & Information Technology (《电子与信息学报》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2024, No. 5, pp. 2159-2176 (18 pages)
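Funding
National Key Research and Development Program of China (2021YFB3100400)
National Natural Science Foundation of China (62172216)
Natural Science Foundation of Jiangsu Province (BK20211180)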
Keywords
Privacy protection
Multi-party data publication
Secure Multi-Party Computation(SMPC)
Personalized Differential Privacy(PDP)
Vertically partitioned data