Abstract
Data companies handling user information, and governments publishing data under the open data trend, urgently need de-identification algorithms. To meet this need, this paper proposes FCPrivBayes, an improved Bayesian network algorithm built on strict differential privacy theory that combines an attribute segment preference mechanism with a clustering algorithm. FCPrivBayes avoids selecting the attributes of the first attribute segment at random, and it discretizes the data by clustering instead of by the equal-width method. Experimental results show that FCPrivBayes effectively improves data utility indicators while preserving data privacy. It offers a new technical option for companies protecting data and for governments releasing data, and it benefits both user privacy protection and the development of the big data industry.
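To illustrate the discretization step mentioned in the abstract, the sketch below contrasts equal-width binning with a clustering-based alternative on a single numeric attribute. It is not the FCPrivBayes implementation: the abstract does not name the clustering algorithm, so plain one-dimensional k-means is an assumption, the function names `equal_width_bins` and `kmeans_1d_bins` are hypothetical, and no differential privacy noise is applied here.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 when all values are equal
    # The maximum value lands exactly on the upper edge, so clamp it to bin k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]


def nearest(v, centroids):
    """Index of the centroid closest to v (ties go to the lower index)."""
    return min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))


def kmeans_1d_bins(values, k, iters=50):
    """Assumed stand-in for the clustering step: 1-D k-means (Lloyd's algorithm)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: spread the starting centroids over the sorted sample.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = [nearest(v, centroids) for v in values]
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return [nearest(v, centroids) for v in values], centroids


values = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 100.0]
print(equal_width_bins(values, 3))   # [0, 0, 0, 0, 0, 0, 2]
print(kmeans_1d_bins(values, 3)[0])  # [0, 0, 0, 1, 1, 1, 2]
```

On this skewed toy sample, equal-width binning merges the two low groups into one interval and leaves the middle interval empty, while the clustering-based bins follow the three groups present in the data.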
Authors
肖彪
闫宏强
罗海宁
李炬成
XIAO Biao; YAN Hongqiang; LUO Haining; LI Jucheng (Beijing Jiaotong University, Beijing 100044, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; National Information Center, Beijing 100045, China; School of Transportation Engineering, Dalian Maritime University, Dalian 116000, China)
Source
《信息网络安全》
CSCD
Peking University Core Journal (北大核心)
2020, No. 11, pp. 75-86 (12 pages)
Netinfo Security
Funding
National Key Research and Development Program of China [2017YFB0801902, 2018YFB2101501].
Keywords
differential privacy theory
Bayesian network algorithm
privacy protection