A QI Weight-Aware Approach to Privacy Preserving Publishing Data Set

Cited by: 17
Abstract: The k-anonymity model is one of the effective methods in data publishing for anonymizing a raw data set before release so as to prevent linking attacks. Existing k-anonymity models and their refinements, however, concentrate on efficiency and scope of application and do not consider that different application domains place different quality requirements on the anonymized table. Within a given domain, the quasi-identifier (QI) attributes contribute to differing degrees to the utility of the analysis tasks run on the published table. If the generalization of each QI attribute is not handled according to the intended use of the table, the generalized table may have poor utility and fail to meet the needs of the analysis task. Based on an analysis of the characteristics of data analysis tasks in different application domains, this paper proposes a scheme in three steps. First, the basic ODP (Open Directory Project) directory system is revised to build a concept generalization structure suited to the specific problem domain. Second, weights are assigned to the generalization paths of the QI attributes to reflect the requirements that the analysis task places on each attribute. Third, a weight-aware anonymization algorithm, WAK (QI weight-aware k-anonymity), is designed as a flexible way to preserve the utility of the published table. Worked examples and experimental results show that the generalized table produced by the scheme meets the specified privacy target while retaining high data utility, satisfying the data quality requirements of specific analysis tasks in the target domain.
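Source: Journal of Computer Research and Development (《计算机研究与发展》; indexed in EI, CSCD, PKU Core), 2012, No. 5, pp. 913-924, 12 pages
Funding: National Natural Science Foundation of China (60673127); National High Technology Research and Development Program of China, "863" Program (2007AA01Z404); Specialized Research Fund for the Doctoral Program of Higher Education (20103218110017); Jiangsu Provincial Science and Technology Support Program (BE2008135); Major Project of Natural Science Research of Anhui Provincial Universities (KJ2010ZD01)
Keywords: data publishing; privacy preserving; weight; k-anonymity; generalization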
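
The abstract describes weighting the generalization of each QI attribute so that attributes more important to the analysis task are generalized less. The abstract does not give the details of WAK, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes two hypothetical attributes (`age`, `zip`) with hand-written hierarchies rather than the revised ODP structure, and an assumed loss function (the weighted sum of normalized generalization levels). It searches all full-domain generalization levels and keeps the k-anonymous one with the lowest weighted loss.

```python
from collections import Counter
from itertools import product

# Hypothetical hierarchies: level 0 is the raw value, higher levels are coarser.
def gen_age(age, level):
    if level == 0:
        return str(age)
    if level == 1:
        lo = age // 10 * 10
        return f"{lo}-{lo + 9}"   # ten-year band
    return "*"                     # fully suppressed

def gen_zip(zipcode, level):
    # Mask the last `level` characters of the code.
    level = min(level, len(zipcode))
    return zipcode[: len(zipcode) - level] + "*" * level

# attribute -> (generalization function, maximum level); assumed for illustration
HIERARCHIES = {"age": (gen_age, 2), "zip": (gen_zip, 5)}
ATTRS = sorted(HIERARCHIES)

def generalize(records, levels):
    """Apply the chosen level of each hierarchy to every record."""
    return [tuple(HIERARCHIES[a][0](r[a], levels[a]) for a in ATTRS)
            for r in records]

def is_k_anonymous(rows, k):
    """Every combination of QI values must occur at least k times."""
    return all(count >= k for count in Counter(rows).values())

def weighted_anonymize(records, weights, k):
    """Return the k-anonymous level assignment with the lowest weighted loss."""
    best = None
    for combo in product(*(range(HIERARCHIES[a][1] + 1) for a in ATTRS)):
        levels = dict(zip(ATTRS, combo))
        rows = generalize(records, levels)
        if not is_k_anonymous(rows, k):
            continue
        # Assumed loss: a higher weight makes generalizing that attribute costlier.
        loss = sum(weights[a] * levels[a] / HIERARCHIES[a][1] for a in ATTRS)
        if best is None or loss < best[0]:
            best = (loss, levels, rows)
    if best is None:
        raise ValueError("no k-anonymous generalization exists")
    return best[1], best[2]

records = [
    {"age": 23, "zip": "21001"},
    {"age": 23, "zip": "21002"},
    {"age": 27, "zip": "21001"},
    {"age": 27, "zip": "21002"},
]
# An age-sensitive task keeps exact ages and coarsens the zip code instead.
levels_age, table_age = weighted_anonymize(records, {"age": 0.8, "zip": 0.2}, k=2)
# A location-sensitive task keeps exact zip codes and coarsens age instead.
levels_zip, table_zip = weighted_anonymize(records, {"age": 0.1, "zip": 0.9}, k=2)
```

With the same data and the same k, the two weightings yield different published tables, which is the behavior the abstract argues for. Exhaustive search is only practical for small hierarchies like these; it is used here for clarity.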