Abstract
Data desensitization applies special processing to confidential or private information contained in data, so that malicious attackers cannot obtain it illegally. Format-preserving encryption is one of many data desensitization techniques; its key advantage is that it keeps the original data format unchanged, which makes it transparent to upper-layer applications to a certain extent. With the arrival of the big data era and the wide adoption of the Hadoop platform, traditional data desensitization techniques built for relational databases can no longer meet the needs of production environments. This paper implements a data desensitization system based on format-preserving encryption for the Hadoop big data platform. The system supports multiple data storage formats and encrypts sensitive data of several value types, including purely numeric, purely alphabetic, and mixed alphanumeric data. Three different implementation approaches are then discussed, and a series of experiments is carried out to evaluate and compare their encryption and desensitization performance in detail.
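The format-preserving property mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the system described in the paper, whose algorithm the abstract does not specify. It assumes a toy alternating Feistel construction with HMAC-SHA256 as the round function; the names `toy_fpe`, `DIGITS`, and the sample key are hypothetical, and the construction is for illustration only, not a vetted cipher.

```python
import hashlib
import hmac

DIGITS = "0123456789"  # hypothetical default alphabet (purely numeric data)


def _to_int(chars, alphabet):
    """Interpret a string as a number written in base len(alphabet)."""
    value = 0
    for ch in chars:
        value = value * len(alphabet) + alphabet.index(ch)
    return value


def _to_str(value, length, alphabet):
    """Write a number as a fixed-length string over the alphabet."""
    out = []
    for _ in range(length):
        value, digit = divmod(value, len(alphabet))
        out.append(alphabet[digit])
    return "".join(reversed(out))


def _round_value(key, round_no, half, modulus):
    """Keyed round function: HMAC-SHA256 reduced into [0, modulus)."""
    msg = bytes([round_no]) + half.encode("utf-8")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def toy_fpe(text, key, alphabet=DIGITS, rounds=8, decrypt=False):
    """Toy Feistel cipher whose output has the same length and alphabet
    as its input. Illustrative only; assumed design, not a standard mode."""
    if len(text) < 2 or any(ch not in alphabet for ch in text):
        raise ValueError("text needs >= 2 characters, all from the alphabet")
    radix = len(alphabet)
    u = len(text) // 2
    if not decrypt:
        a, b = text[:u], text[u:]
        for i in range(rounds):
            m = radix ** len(a)
            c = (_to_int(a, alphabet) + _round_value(key, i, b, m)) % m
            a, b = b, _to_str(c, len(a), alphabet)
        return a + b
    # The two halves swap lengths every round, so the split point of the
    # ciphertext depends on whether the round count is even or odd.
    split = u if rounds % 2 == 0 else len(text) - u
    a, b = text[:split], text[split:]
    for i in reversed(range(rounds)):
        m = radix ** len(b)
        prev = (_to_int(b, alphabet) - _round_value(key, i, a, m)) % m
        a, b = _to_str(prev, len(b), alphabet), a
    return a + b


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical key, for illustration only
    masked = toy_fpe("13800138000", key)
    print(masked, toy_fpe(masked, key, decrypt=True))
```

The masked value is still an 11-digit string, so a column or application that expects a phone-number-like field accepts it unchanged, and decryption with the same key restores the original. Passing a different `alphabet` (for example lowercase letters) handles purely alphabetic values in the same way.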
Source
Telecommunications Science (《电信科学》)
Peking University Core Journal (北大核心)
2017, No. 3, pp. 119-125 (7 pages)
Keywords
big data
data desensitization
format-preserving encryption
system
evaluation