Abstract
In big-data settings, homomorphic encryption can effectively address privacy leakage in machine learning. Using the CKKS homomorphic encryption scheme, this paper designs a two-party privacy-preserving ridge regression scheme based on an improved conjugate gradient method with division delay in the ciphertext domain. With only a small amount of interaction, the two parties can efficiently train a ridge regression model on encrypted data without leaking private information during the process. The security, computational complexity, and communication complexity of the scheme are analyzed, and the scheme is implemented in C++ on top of the HEAAN homomorphic encryption library. The scheme is validated on public datasets, and the experiments show that it trains ridge regression models securely and efficiently. For the UCI Twitter dataset, with 77 features and 4000 samples, training requires only 16 iterations, takes 127.5 s, and incurs 41.87 MB of communication. The improved conjugate gradient method with division delay converges quickly in the ciphertext domain, giving high training efficiency and low communication cost, and the model parameters obtained on encrypted data differ from those computed on plaintext data by no more than 0.001. The scheme can therefore meet practical application requirements in specific scenarios.
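As a plaintext point of reference for the conjugate gradient method named in the abstract, the following is a minimal C++ sketch of textbook conjugate gradient applied to the ridge regression normal equations (XᵀX + λI)w = Xᵀy. It is an assumption-laden illustration, not the paper's scheme: it uses no encryption, makes no HEAAN calls, and does not reproduce the division-delay modification. All names (`Vec`, `Mat`, `dot`, `applyRidgeOperator`, `ridgeConjugateGradient`) and the parameters `maxIter` and `tol` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types: a dense vector and a row-major matrix
// (n samples x d features). Not taken from the paper or from HEAAN.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Inner product of two equally sized vectors.
double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Computes (X^T X + lambda * I) v without forming X^T X explicitly.
Vec applyRidgeOperator(const Mat& X, double lambda, const Vec& v) {
    Vec out(v.size(), 0.0);
    for (const Vec& row : X) {
        const double t = dot(row, v);  // (X v)_i
        for (std::size_t j = 0; j < v.size(); ++j) out[j] += row[j] * t;
    }
    for (std::size_t j = 0; j < v.size(); ++j) out[j] += lambda * v[j];
    return out;
}

// Textbook conjugate gradient on plaintext data for
// (X^T X + lambda * I) w = X^T y, starting from w = 0.
// Note the two divisions per iteration (alpha and beta); this sketch
// performs them immediately and does NOT model any division delay.
Vec ridgeConjugateGradient(const Mat& X, const Vec& y, double lambda,
                           int maxIter, double tol) {
    assert(!X.empty() && X.size() == y.size());
    const std::size_t d = X[0].size();

    Vec b(d, 0.0);  // right-hand side X^T y
    for (std::size_t i = 0; i < X.size(); ++i)
        for (std::size_t j = 0; j < d; ++j) b[j] += X[i][j] * y[i];

    Vec w(d, 0.0);
    Vec r = b;  // residual b - A w, with w = 0
    Vec p = r;  // initial search direction
    double rr = dot(r, r);

    for (int k = 0; k < maxIter && std::sqrt(rr) > tol; ++k) {
        const Vec Ap = applyRidgeOperator(X, lambda, p);
        const double alpha = rr / dot(p, Ap);  // step length
        for (std::size_t j = 0; j < d; ++j) {
            w[j] += alpha * p[j];
            r[j] -= alpha * Ap[j];
        }
        const double rrNew = dot(r, r);
        const double beta = rrNew / rr;  // direction update coefficient
        for (std::size_t j = 0; j < d; ++j) p[j] = r[j] + beta * p[j];
        rr = rrNew;
    }
    return w;
}
```

Calling `ridgeConjugateGradient(X, y, lambda, maxIter, tol)` returns the weight vector. In exact arithmetic, conjugate gradient on a d-dimensional symmetric positive definite system terminates in at most d iterations, which is consistent with the small iteration count the abstract reports relative to the 77 features.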
Authors
LYU You (吕由); WU Wen-Yuan (吴文渊)
Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China; University of Chinese Academy of Sciences, Beijing 100049, China
Source
Journal of Cryptologic Research (《密码学报》)
CSCD
2023, No. 2, pp. 276-288 (13 pages)
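Funding
National Key Research and Development Program of China (2020YFA0712303)
Chongqing Science and Technology Innovation Guidance Project Led by Academicians in Chongqing (cstc2020yszx-jcyjX0005)
Guizhou Provincial Science and Technology Program ([2020]4Y056)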
Keywords
privacy preserving
ridge regression
homomorphic encryption
conjugate gradient method
HEAAN