Abstract
To address the leakage of users' private data in existing large-scale data analysis and data mining, which cannot withstand malicious attacks by adversaries with arbitrary background knowledge, this paper introduces a differential privacy mechanism into Spark, an in-memory parallel computing framework for big data. Sensitive information in the pattern mining process is perturbed with Laplace noise, and an Apriori association analysis algorithm that satisfies differential privacy under the Spark framework is proposed. Using the composition property of differential privacy, the algorithm is proved theoretically to satisfy ε-differential privacy, and the same property guides the allocation of the privacy budget. Experimental results show that the proposed algorithm iterates more efficiently and is more secure than a privacy-preserving Apriori algorithm implemented on the MapReduce framework, and that, while preserving data utility, it offers good privacy protection and good timeliness. The algorithm also shows good application prospects for privacy-preserving pattern mining.
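The idea summarized in the abstract (Laplace noise on support counts, with the privacy budget divided using the composition property) can be illustrated with a minimal single-machine sketch in plain Python. It is not the authors' Spark implementation. The function names `laplace_noise` and `dp_apriori`, the equal split of ε across levels, the publicly known item universe `items`, and the conservative sensitivity bound (the number of candidates per level) are all assumptions made for illustration; the abstract does not state the paper's actual budget allocation.

```python
import random
from itertools import combinations


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_apriori(transactions, items, min_support, epsilon, max_len, seed=None):
    """Level-wise Apriori over noisy support counts (illustrative sketch only).

    Assumptions: `items` is a public item universe, and the total budget
    `epsilon` is split equally over `max_len` levels, so by sequential
    composition the levels together consume at most `epsilon`.
    """
    rng = random.Random(seed)
    transactions = [frozenset(t) for t in transactions]
    eps_level = epsilon / max_len
    candidates = [frozenset([i]) for i in sorted(items)]
    frequent = {}
    for k in range(1, max_len + 1):
        if not candidates:
            break
        # One transaction changes each candidate count by at most 1, so the
        # L1 sensitivity of the whole count vector is at most len(candidates).
        scale = len(candidates) / eps_level
        level = {}
        for cand in candidates:
            true_count = sum(1 for t in transactions if cand <= t)
            noisy = true_count + laplace_noise(scale, rng)
            if noisy >= min_support:
                level[cand] = noisy
        frequent.update(level)
        # Join step: merge surviving k-itemsets into (k+1)-itemsets.
        joined = {a | b for a, b in combinations(list(level), 2)
                  if len(a | b) == k + 1}
        # Prune step: every k-subset must itself have survived this level.
        candidates = sorted(
            (c for c in joined
             if all(frozenset(s) in level for s in combinations(c, k))),
            key=sorted,
        )
    return frequent
```

A call such as `dp_apriori(data, items, min_support=50, epsilon=1.0, max_len=3, seed=0)` returns a dictionary mapping each retained itemset to its noisy support. Smaller values of `epsilon` give stronger privacy but noisier counts, which is the utility trade-off the abstract refers to.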
Authors
李庆鹏
张龙军
李昊宇
LI Qingpeng, ZHANG Longjun, LI Haoyu (Postgraduate Brigade Department of Information Engineering, Engineering University of PAP, Xi'an 710086, China)
Source
《武警工程大学学报》
2017, No. 2, pp. 22-25 (4 pages)
Journal of Engineering University of the Chinese People's Armed Police Force
Keywords
in-memory computing framework
Spark
differential privacy
association analysis
pattern mining
association rule algorithm