Abstract
As big data applications continue to grow, the sensitive personal information contained in data faces an increasing risk of disclosure. Data masking can protect such information when data is released. The three mainstream masking techniques, k-anonymity, l-diversity and t-closeness, do not take data semantics into account. To better protect highly sensitive attribute values under complex semantics, this paper adopts t-closeness, uses the Hellinger distance as the distance measure, and introduces a measure of sensitivity by classifying and weighting the sensitive attribute values. Data analysis and experimental results show that the method strengthens the protection of sensitive data with complex semantics at an acceptable increase in masking time. The classification weighting scheme is also convenient and flexible, so it can meet different needs in practical use.
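The abstract does not give the weighting formula, so the following is only an illustrative sketch of how a t-closeness check based on a class-weighted Hellinger distance might look. The function names, the per-category weights, and the way the weights enter the distance are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def distribution(values, categories):
    """Relative frequency of each category in `values`, in `categories` order."""
    counts = Counter(values)
    total = len(values)
    return [counts[c] / total for c in categories]


def weighted_hellinger(p, q, weights):
    """Hellinger distance with a per-category weight (hypothetical scheme).

    With all weights equal to 1 this is the standard Hellinger distance
    sqrt(1/2 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which lies in [0, 1].
    A weight above 1 makes deviations in that category count for more.
    """
    total = sum(
        w * (math.sqrt(pi) - math.sqrt(qi)) ** 2
        for pi, qi, w in zip(p, q, weights)
    )
    return math.sqrt(total / 2.0)


def satisfies_t_closeness(group, table, categories, weights, t):
    """True if the group's sensitive-value distribution is within t of the table's."""
    p = distribution(group, categories)
    q = distribution(table, categories)
    return weighted_hellinger(p, q, weights) <= t


if __name__ == "__main__":
    # Toy data: the last category is treated as the most sensitive (assumed weight 3).
    categories = ["flu", "cold", "HIV"]
    weights = [1.0, 1.0, 3.0]
    table = ["flu"] * 4 + ["cold"] * 4 + ["HIV"] * 2
    group = ["flu", "cold", "HIV", "HIV"]
    p = distribution(group, categories)
    q = distribution(table, categories)
    print("distance:", weighted_hellinger(p, q, weights))
    print("t-close at t=0.2:", satisfies_t_closeness(group, table, categories, weights, 0.2))
```

In this sketch a higher weight on a highly sensitive category makes an equivalence class that over-represents that category fail the threshold sooner, which is one plausible reading of "classifying and weighting the sensitive attribute values".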
Authors
吴克河
朱海
李为
崔文超
张晓亮
程瑞
WU Ke-he; ZHU Hai; LI Wei; CUI Wen-chao; ZHANG Xiao-liang; CHENG Rui (North China Electric Power University, Beijing 102206, China)
Source
Information Technology (《信息技术》)
2019, No. 11, pp. 5-9 (5 pages)
Funding
State Grid Science and Technology Project (521304190004)