摘要
在现实生活中,不同的行业之间,甚至同行业不同部门之间的数据并不互通,随着计算机算力的提升,制约模型训练效果的不是算力而是数据量。因此,想要得到更好的算法模型,仅靠某一方的数据是不够的,需要两方或者多方的参与,这就要求对各方的数据进行隐私保护。除此之外,随着收集的数据越来越详细,数据的维数也越来越大。面对高维的数据,数据降维是不可缺少的环节,而在数据降维方面,主成分分析(Principal Component Analysis,PCA)是常用的手段。当拥有数据的两方想要合作进行隐私保护的数据降维时,同态加密技术是一种解决办法。同态加密技术可以在保护数据隐私的前提下对加密数据进行计算,可以用在加密数据的PCA上。针对上述应用场景,利用CKKS同态加密方案,通过幂法迭代的SVD技术设计了一种两方加密数据进行PCA的方案,在保护两方数据隐私的前提下实现数据降维的目的;通过改进传统幂法迭代步骤,避免了代价高昂的同态密文除法运算,使得在选取较小的加密参数时,也能支持更多的幂法迭代次数,从而在缩短同态计算时间的同时提高计算精度。在公共数据集上进行测试,并与现有方案进行对比,该方案在计算耗时上缩短了约80%,与明文计算结果的均方误差缩减到1%以内。
In real life,data is not interconnected between different industries,or even between different departments within the same industry.With the improvement of computer computing power,it is not computing power but data volume that restricts the effectiveness of model training.Therefore,in order to obtain a better algorithm model,relying solely on one party’s data is not enough.It needs the participation of two or more parties,which requires privacy protection for all parties.In addition,as data collection becomes more detailed,the data dimension also increases.For high dimension data,dimension reduction is an indispensable step.And in terms of dimension reduction,principal component analysis(PCA)is a commonly used method.Homomorphic encryption is a solution when two parties want to collaborate on privacy protection data dimension reduction.Homomorphic encryption can compute encrypted data while protecting data privacy,and can be used to compute the PCA on encrypted data.In this paper,a two party encrypted data PCA scheme is designed using the CKKS homomorphic encryption scheme and the power method for dominant eigenvectors,achieving the goal of dimension reduction while protecting the privacy of both parties’data.By improving the traditional power method iteration steps,the expensive homomorphic ciphertext division is avoided,allowing for more iterations with small encryption parameters,thereby reducing the computing time and improving the accuracy of the computed results.Through testing on public datasets and comparing it with some existing schemes,the scheme reduces the computational time by about 80%,and reduces the mean squared error to within 1%compared to the plaintext computation results.
作者
张金斗
陈经纬
吴文渊
冯勇
ZHANG Jindou;CHEN Jingwei;WU Wenyuan;FENG Yong(Chongqing Institute of Green and Intelligent Technology,Chinese Academy of Sciences,Chongqing 400714,China;Chongqing School,University of Chinese Academy of Sciences,Chongqing 400714,China)
出处
《计算机科学》
CSCD
北大核心
2024年第8期387-395,共9页
Computer Science
基金
科技部重点研发计划(2020YFA0712300)
重庆市自然科学基金(cstc2021jcyj-msxmX0821,CSTB2023NSCQ-MSX0441,cstc2021yszx-jcyjX0004,2022YSZX-JCX0011CSTB,CSTB2023YSZX-JCX0008)
中国科学院西部青年学者项目。
关键词
同态加密
隐私保护
主成分分析
奇异值分解
幂法
Homomorphic encryption
Privacy preserving
Principal component analysis
Singular value decomposition
Power method