Abstract
To better resist background knowledge attacks and homogeneity attacks, and to protect either specific sensitive values or all sensitive values, a single-sensitive-value (α,k)-anonymity model and a multi-sensitive-value (α,k)-anonymity model are defined. Two clustering algorithms are designed to implement these models, and the correctness and complexity of the algorithms are analyzed. For data sets that contain both continuous and categorical attributes, a detailed mapping and processing method is given so that distances between data points can be computed easily, which avoids confusing the distance between data points with information loss. Detailed theoretical analysis and extensive experimental evaluation show that, compared with existing methods, the algorithms incur low information loss and have short execution times.
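The (α,k)-anonymity condition named in the abstract can be illustrated with a minimal sketch. It assumes the definition commonly used in the literature (every group of records sharing the same quasi-identifier values has at least k records, and the relative frequency of a protected sensitive value in each group is at most α); the paper's own single-value and multi-value definitions may differ in detail. The function name, parameters, and record layout below are hypothetical and are not taken from the paper, and the sketch only checks the condition; it does not reproduce the paper's clustering algorithms.

```python
from collections import Counter, defaultdict


def is_alpha_k_anonymous(records, qi_keys, sensitive_key, alpha, k,
                         sensitive_values=None):
    """Check an (alpha, k)-anonymity condition on already-generalized records.

    Assumed definition (not confirmed by the source): each group of records
    with identical quasi-identifier values has size >= k, and each protected
    sensitive value has relative frequency <= alpha inside every group.
    If sensitive_values is None, every value occurring in a group is treated
    as protected (multi-value case); otherwise only the listed values are
    checked (single-value case when the list has one element).
    """
    # Group the sensitive values by quasi-identifier tuple.
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[q] for q in qi_keys)].append(rec[sensitive_key])

    for values in groups.values():
        n = len(values)
        if n < k:  # k-anonymity part of the condition
            return False
        counts = Counter(values)
        targets = counts.keys() if sensitive_values is None else sensitive_values
        for s in targets:
            if counts[s] / n > alpha:  # frequency bound on sensitive values
                return False
    return True
```

Passing a single value in `sensitive_values` corresponds to protecting one specific sensitive value, while leaving it as `None` bounds the frequency of every sensitive value in each group.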
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2011, No. 8, pp. 1941-1946 (6 pages)
Funding
National Natural Science Foundation of China (No. 61073043, No. 61073041, No. 60873037)
Natural Science Foundation of Heilongjiang Province (No. F200901)
Keywords
data publishing
k-anonymity
l-diversity
privacy preserving
clustering