Abstract
As the data assets of enterprises, governments, and private individuals continue to grow, so does the demand for machine learning classification applications such as image classification. To meet these practical needs, deploying models as cloud services, known as machine learning as a service (MLAAS), has gradually become mainstream. However, applications built on cloud services often raise serious data privacy and security problems. FPCBC is a federated learning privacy-preserving classification system based on crowdsourcing aggregation. It crowdsources classification tasks to multiple edge participants and completes them with the help of cloud computing. Instead of jointly training an ideal model to obtain highly reliable classification results, each participant first runs inference with a model trained on its own limited local data, and the inference results are then aggregated with mature algorithms to obtain a classification with higher accuracy. Importantly, the data querier does not leak any private data, which addresses the privacy and security problems of traditional MLAAS. In the system implementation, we use homomorphic encryption to encrypt the image data submitted for machine learning inference; we also improve a crowdsourced federated learning classification algorithm and introduce a dual-server mechanism to achieve privacy-preserving computation across the entire system. Experiments and performance analysis show that the system is feasible, that its level of privacy protection is significantly improved, and that it is better suited to real-world applications with high privacy and security requirements.
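The aggregation step described in the abstract can be illustrated with a small sketch. The abstract does not name the aggregation algorithm, so plain majority voting is used here purely as a stand-in, and the function names `aggregate_by_majority` and `aggregate_batch` are hypothetical. The sketch works on plaintext labels and does not model the homomorphic encryption or the dual-server mechanism.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def aggregate_by_majority(labels: Sequence[Hashable]) -> Hashable:
    """Return the most common label among the participants' predictions.

    Assumption: majority voting stands in for the unspecified
    aggregation algorithm. Ties go to the label encountered first.
    """
    if not labels:
        raise ValueError("need at least one participant label")
    return Counter(labels).most_common(1)[0][0]


def aggregate_batch(per_participant: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Aggregate several queries at once.

    per_participant[i][j] is participant i's predicted label for query j.
    """
    # zip(*...) regroups the predictions so each tuple holds one query's votes.
    return [aggregate_by_majority(votes) for votes in zip(*per_participant)]
```

For example, three participants that label three queries as `[0, 1, 2]`, `[0, 1, 1]`, and `[1, 1, 2]` would be aggregated by calling `aggregate_batch` on the list of those three lists.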
Authors
Jin Ge (金歌)
Wei Xiaochao (魏晓超)
Wei Senmao (魏森茂)
Wang Hao (王皓)
Affiliation: College of Information Science and Engineering, Shandong Normal University, Jinan 250358
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
PKU Core Journal (北大核心)
2022, No. 11, pp. 2377-2394 (18 pages)
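Funding
China Postdoctoral Science Foundation (2018M632712)
National Natural Science Foundation of China, Youth Program (61802235)
National Natural Science Foundation of China, General Program (62071280)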
Keywords
federated learning
crowdsourcing
homomorphic encryption
privacy-preserving machine learning
classification