Correlation Analysis for Key-Value Data with Local Differential Privacy (cited by: 2)
Abstract: In crowdsensing systems, continuously collecting and analyzing data from distributed sources provides decision support for advanced data-mining models. Because the data may contain personal information, collection and analysis usually carry a risk of privacy leakage. Local differential privacy (LDP), an advanced privacy-protection scheme, offers a good trade-off between user privacy and data utility. Key-value data is heterogeneous: the key is categorical and the value is numerical, so multi-dimensional correlation analysis of key-value data under LDP is challenging. To address the publishing and correlation analysis of key-value data under privacy protection, this paper first defines the frequency-correlation and mean-correlation problems for key-value data, then proposes an indexing one-hot encoding for key-value pairs that provides LDP protection, and finally performs correlation analysis on the perturbed data. Theoretical analysis and experiments on both synthetic and real-world datasets validate the effectiveness of the proposed scheme.
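The abstract names an indexing one-hot encoding for key-value pairs but does not spell out the mechanism, so the following is only a minimal sketch of one way such an encoding could work, not the authors' algorithm. It assumes each user holds a single pair with a key in `0..d-1` and a value in `[-1, 1]`, discretizes the value to a sign, maps the (key, sign) pair to one index of a bit vector of length `2*d`, and flips each bit independently with randomized response. The function names `perturb`, `estimate`, and `simulate`, and the toy distribution inside `simulate`, are hypothetical.

```python
import math
import random


def keep_probability(epsilon):
    # Two one-hot vectors differ in at most two bits, so each bit
    # is perturbed with half of the privacy budget.
    return math.exp(epsilon / 2) / (math.exp(epsilon / 2) + 1)


def perturb(key, value, d, epsilon, rng):
    """Encode one (key, value) pair as a noisy bit vector of length 2*d."""
    # Round value in [-1, 1] to +1 or -1 so that its expectation is preserved.
    positive = rng.random() < (1.0 + value) / 2.0
    hot = 2 * key + (0 if positive else 1)  # index of the (key, sign) pair
    p = keep_probability(epsilon)
    # Keep each bit with probability p, flip it otherwise.
    return [
        (1 if j == hot else 0) ^ (0 if rng.random() < p else 1)
        for j in range(2 * d)
    ]


def estimate(reports, d, epsilon):
    """Estimate per-key frequency and mean from the noisy reports."""
    n = len(reports)
    p = keep_probability(epsilon)
    q = 1.0 - p
    counts = [sum(r[j] for r in reports) for j in range(2 * d)]
    # E[count_j] = n_j * p + (n - n_j) * q, solved for the true count n_j.
    corrected = [(c - n * q) / (p - q) for c in counts]
    freqs, means = [], []
    for k in range(d):
        pos, neg = corrected[2 * k], corrected[2 * k + 1]
        total = pos + neg
        freqs.append(total / n)
        means.append((pos - neg) / total if total > 0 else 0.0)
    return freqs, means


def simulate(n=20000, epsilon=4.0, seed=0):
    """Toy run: three keys, each with a constant (assumed) value."""
    rng = random.Random(seed)
    key_probs = [0.5, 0.3, 0.2]
    true_means = [0.6, -0.4, 0.0]
    d = len(key_probs)
    reports = []
    for _ in range(n):
        key = rng.choices(range(d), weights=key_probs)[0]
        reports.append(perturb(key, true_means[key], d, epsilon, rng))
    return estimate(reports, d, epsilon)


if __name__ == "__main__":
    freqs, means = simulate()
    print("estimated frequencies:", [round(f, 3) for f in freqs])
    print("estimated means:", [round(m, 3) for m in means])
```

In this sketch the frequency of a key is recovered from the sum of its two corrected counts and the mean from their normalized difference. Correlation between several key-value dimensions, which the paper defines, is not attempted here.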
Authors: SUN Lin; PING Guo-lou; YE Xiao-jun (School of Software, Tsinghua University, Beijing 100084, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 8, pp. 278-283 (6 pages)
Funding: National Key Research and Development Program of China (2019QY1402).
Keywords: Local differential privacy; Key-value data; Correlation analysis; Mean estimation; Frequency estimation