Privacy-preserving Linear Regression Scheme and Its Application
Abstract: Linear regression is a fundamental and widely used machine learning algorithm. Training a linear regression model usually requires a large amount of data, but in practice data sets are held by different users and contain private information. When several users want to pool their data to train a better model, privacy concerns inevitably arise. Homomorphic encryption is a privacy protection technique that can effectively prevent privacy leakage during computation. For the scenario in which the data set is horizontally partitioned between two users, a new privacy-preserving linear regression scheme based on a hybrid iterative method is designed using the CKKS homomorphic encryption scheme. The scheme has two stages. The first stage implements the stochastic gradient descent algorithm in the ciphertext domain. The second stage is a secure two-party fast descent protocol whose core idea is based on the Jacobi iterative method. It compensates for the poor convergence of gradient descent in practical applications and accelerates the convergence of the model, which reduces the computational and communication costs of the scheme while protecting the data privacy of both users during efficient training. The efficiency, communication cost, and security of the scheme are analyzed, and the scheme is implemented in C++ and applied to real data sets. Extensive experiments show that the scheme can efficiently solve linear regression problems with a large number of features. The relative error of the coefficient of determination is less than 0.001, which shows that the privacy-preserving model performs on real data sets nearly as well as a model trained directly on unencrypted data, and that the scheme can meet practical requirements in specific scenarios.
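The two-stage idea in the abstract can be illustrated with a plaintext sketch. This is not the authors' protocol: it uses no encryption, it is written in Python rather than the C++ of the paper, it replaces stochastic gradient descent with full-batch gradient steps to stay deterministic, and every function name and data value below is hypothetical. It assumes the aggregated matrix X^T X is strictly diagonally dominant, which is a sufficient condition for Jacobi iteration to converge.

```python
# Plaintext illustration only: no homomorphic encryption, hypothetical toy data.

def gram(X, y):
    """Return (X^T X, X^T y) computed from one party's local rows."""
    d = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    return A, b

def combine(part_a, part_b):
    """Horizontal partitioning: the two parties' statistics simply add."""
    (A1, b1), (A2, b2) = part_a, part_b
    A = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    b = [p + q for p, q in zip(b1, b2)]
    return A, b

def gradient_steps(A, b, w, lr, steps):
    """Stage 1 (simplified): full-batch descent on 0.5*||Xw - y||^2,
    whose gradient is A w - b."""
    d = len(b)
    for _ in range(steps):
        grad = [sum(A[i][j] * w[j] for j in range(d)) - b[i] for i in range(d)]
        w = [w[i] - lr * grad[i] for i in range(d)]
    return w

def jacobi_steps(A, b, w, steps):
    """Stage 2: Jacobi iteration on the normal equations A w = b."""
    d = len(b)
    for _ in range(steps):
        w = [(b[i] - sum(A[i][j] * w[j] for j in range(d) if j != i)) / A[i][i]
             for i in range(d)]
    return w

def r_squared(X, y, w):
    """Coefficient of determination of the fitted model."""
    pred = [sum(x * c for x, c in zip(row, w)) for row in X]
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

# Noise-free toy data generated from the weights [2, -1], split by rows.
X_a, y_a = [[1.0, 0.0], [0.0, 1.0]], [2.0, -1.0]
X_b, y_b = [[1.0, 0.2], [0.1, 1.0]], [1.8, -0.8]

A, b = combine(gram(X_a, y_a), gram(X_b, y_b))
w = gradient_steps(A, b, [0.0, 0.0], lr=0.1, steps=5)  # coarse start
w = jacobi_steps(A, b, w, steps=30)                    # fast refinement
r2 = r_squared(X_a + X_b, y_a + y_b, w)
print(w, r2)
```

In this sketch the gradient steps only provide a rough starting point and the Jacobi sweeps do most of the work. In the scheme described above, the corresponding computations would be carried out on CKKS ciphertexts and through a two-party protocol, neither of which is modeled here.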
Authors: LYU You (吕由); WU Wen-yuan (吴文渊) (Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 9, pp. 318-325 (8 pages)
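Funding: Key R&D Program of the Ministry of Science and Technology (2020YFA0712303); Guizhou Provincial Science and Technology Program ([2020]4Y056); Chongqing Special Program for Science and Technology Innovation Guidance Led by Academicians in Chongqing (cstc2020yszx-jcyjX0005).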
Keywords: Privacy-preserving; Linear regression; Hybrid iterative method; Homomorphic encryption