Particle Swarm Optimization-based Federated Learning Method for Heterogeneous Data
Abstract: Federated learning is an emerging privacy-preserving distributed machine learning framework whose core feature is that it performs distributed machine learning without access to clients' raw data. Each client trains a model on its local data and then uploads the model parameters to the server for aggregation, so that client data stays protected throughout. Two problems arise in this process: high communication cost caused by frequent parameter transfers, and the non-independent and identically distributed (non-IID) heterogeneous data held by the clients. Both severely limit the application of federated learning. To address these problems, FedPSG, a particle swarm optimization-based federated learning method for heterogeneous data, is proposed. FedPSG changes what clients send to the server from model parameters to model scores, so that only a small number of clients need to upload model parameters in each training round, which reduces communication cost. In addition, a model retraining strategy is proposed that uses server-side data to train the global model for a second iteration, further improving model performance by mitigating the impact of data heterogeneity on federated learning. Experiments simulating different data-heterogeneity settings are conducted on the MNIST, FashionMNIST, and CIFAR-10 datasets. The results show that FedPSG effectively improves model accuracy under different data-heterogeneity settings and confirm that the model retraining strategy effectively addresses the client-side data heterogeneity problem.
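The communication pattern described in the abstract (clients report scores, only a few upload parameters, and the server then retrains on its own data) might be sketched roughly as follows. This is a toy illustration under stated assumptions, not the FedPSG implementation: the 1-D linear model, the use of local mean squared error as the "model score", the `top_k` selection, the plain averaging of uploaded models, and all function names are hypothetical, and the particle swarm update itself is not shown because the abstract does not specify it.

```python
# Toy sketch of a score-based federated round (assumptions throughout).
# A model is a pair (w, b) for the 1-D linear predictor y = w * x + b.

def local_train(weights, data, lr=0.1, epochs=5):
    """Plain SGD on squared error; stands in for any local training routine."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def score(weights, data):
    """Hypothetical model score: mean squared error (lower is better)."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


def federated_round(global_w, client_data, server_data, top_k=1):
    """Run one round and return (new_global_model, number_of_model_uploads)."""
    # 1. Every client trains locally and reports only a scalar score.
    local_models = [local_train(global_w, d) for d in client_data]
    scores = [score(m, d) for m, d in zip(local_models, client_data)]

    # 2. The server requests parameters only from the top_k best-scoring clients.
    chosen = sorted(range(len(scores)), key=scores.__getitem__)[:top_k]

    # 3. Aggregate the uploaded models (simple average; an assumption).
    w = sum(local_models[i][0] for i in chosen) / len(chosen)
    b = sum(local_models[i][1] for i in chosen) / len(chosen)

    # 4. Retraining step: refine the aggregated model on server-side data.
    new_global = local_train((w, b), server_data, epochs=2)
    return new_global, len(chosen)


if __name__ == "__main__":
    def make(xs):
        return [(x, 2.0 * x + 1.0) for x in xs]

    # Each client sees a different slice of the input range (a crude non-IID split).
    clients = [
        make([-1.0 + 0.1 * i for i in range(7)]),
        make([-0.3 + 0.1 * i for i in range(7)]),
        make([0.4 + 0.1 * i for i in range(7)]),
    ]
    server = make([-1.0 + 0.25 * i for i in range(9)])

    model = (0.0, 0.0)
    for _ in range(10):
        model, uploads = federated_round(model, clients, server, top_k=1)
    print("model:", model, "server MSE:", score(model, server))
```

In this sketch each round moves one scalar per client plus `top_k` full models to the server, instead of one full model per client, which is the source of the communication saving the abstract refers to.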
Authors: XU Yicheng (徐奕成); DAI Chaofan (戴超凡); MA Wubin (马武彬); WU Yahui (吴亚辉); ZHOU Haohao (周浩浩); LU Chenyang (鲁晨阳) (National Key Laboratory of Information Systems Engineering, National University of Defense Technology, Changsha 410073, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 6, pp. 391-398 (8 pages)
Funding: General Program of the National Natural Science Foundation of China (61871388).
Keywords: Federated learning; Particle swarm algorithm; Communication cost; Data heterogeneity; Privacy protection