Data Masking Analysis Based on Internet Big Data (cited by: 14)
Abstract: [Objective] Building on existing data masking techniques, this paper improves how anonymous groups are partitioned in order to obtain a better masking model and algorithm. [Methods] Starting from k-anonymity, the authors revise the criterion for choosing the partitioning dimension and use a KD tree as the storage structure to construct a new algorithm. The algorithm is implemented in Python, and its feasibility and effectiveness are verified by comparing the number of anonymous groups produced and the NCP percentage. [Results] The new algorithm maximizes the number of anonymous groups generated from the masked dataset, and its NCP percentage is lower than that of comparable algorithms. [Limitations] For datasets in which one attribute is markedly more dispersed than the others, repeatedly computing the partitioning dimension is cumbersome. [Conclusions] Compared with traditional algorithms, the new algorithm yields more anonymous groups; compared with similar algorithms, it incurs lower information loss.
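The abstract does not spell out the revised dimension criterion, so the following is only a minimal sketch of the baseline it builds on: a KD-tree style (Mondrian-like) k-anonymous partition with a plain positional median split, plus one common way to express the normalized certainty penalty (NCP) as a percentage. The function names, the widest-normalized-range rule, and the NCP normalization are assumptions made for illustration, not the authors' algorithm.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]


def spans(records: Sequence[Record]) -> List[float]:
    """Range (max - min) of each numeric attribute over the given records."""
    dims = len(records[0])
    return [max(r[d] for r in records) - min(r[d] for r in records)
            for d in range(dims)]


def partition(records: Sequence[Record], k: int,
              full_spans: Sequence[float]) -> List[List[Record]]:
    """Recursively split records KD-tree style into groups of at least k.

    Assumption: the split dimension is the one with the widest range
    relative to the whole dataset, and the cut is a positional median.
    """
    if len(records) < 2 * k:
        return [list(records)]  # cannot split into two groups of size >= k
    local = spans(records)
    dim = max(range(len(local)),
              key=lambda d: local[d] / full_spans[d] if full_spans[d] else 0.0)
    ordered = sorted(records, key=lambda r: r[dim])
    mid = len(ordered) // 2  # both halves hold >= k because len >= 2k
    return (partition(ordered[:mid], k, full_spans)
            + partition(ordered[mid:], k, full_spans))


def ncp_percent(groups: Sequence[Sequence[Record]],
                full_spans: Sequence[float]) -> float:
    """Size-weighted average of per-group normalized ranges, as a percentage."""
    dims = len(full_spans)
    total = sum(len(g) for g in groups)
    penalty = 0.0
    for g in groups:
        local = spans(g)
        penalty += len(g) * sum(l / f for l, f in zip(local, full_spans) if f > 0)
    return 100.0 * penalty / (total * dims)


# Hypothetical toy data: (age, postal-code prefix) pairs.
data = [(25, 10), (27, 12), (31, 11), (35, 40),
        (42, 42), (47, 45), (52, 80), (58, 85)]
full = spans(data)
groups = partition(data, k=2, full_spans=full)
print(len(groups), ncp_percent(groups, full))
```

With a plain median split the group count does not always reach the upper bound of floor(n/k): 11 records with k = 2 split into halves of 5 and 6, which yield 2 groups each, so 4 groups rather than the possible 5. A lower NCP percentage means the generalized ranges are narrower, that is, less information is lost.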
Authors: Zhou Qianyi (周倩伊), Wang Yamin (王亚民), Wang Chuang (王闯); School of Economics and Management, Xidian University, Xi'an 710126, China
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), 2018, No. 2, pp. 58-63 (6 pages). Indexed in CSSCI, CSCD, and the Peking University Core Journals list.
Keywords: data masking; k-anonymity model; integer division
