Abstract
Federated learning is an emerging distributed machine learning framework for privacy preservation. However, models trained under federated learning usually perform worse than those trained in the standard centralized setting, especially when the training data are class imbalanced. To address the class imbalance problem in federated learning while protecting client privacy, a global data distribution estimation method based on homomorphic encryption is proposed. To handle extreme class imbalance, each client then re-samples its local data with a combination of over-sampling and under-sampling guided by the estimated global distribution. Experimental results show that the proposed method improves the convergence speed and classification performance of federated learning models under class imbalance without disclosing client privacy.
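The abstract only summarizes the protocol, so the following is a minimal sketch of the two steps it names, under loose assumptions: a Paillier cryptosystem from the third-party `phe` library stands in for the homomorphic encryption scheme, the per-class target rule and the `resample` helper are hypothetical, and the labels are toy data. It is not the paper's implementation.

```python
# Sketch: encrypted aggregation of per-class counts, then local re-sampling.
from collections import Counter
import random

from phe import paillier  # pip install phe

# --- 1. Privacy-preserving estimation of the global class distribution ---
# Each client encrypts its per-class sample counts; the server sums the
# ciphertexts (additive homomorphism) without seeing any individual count.
pub_key, priv_key = paillier.generate_paillier_keypair(n_length=2048)

client_labels = [
    [0, 0, 0, 1, 2, 2],        # client A's local labels (toy data)
    [0, 1, 1, 1, 1, 2],        # client B
    [0, 0, 2, 2, 2, 2, 2],     # client C
]
num_classes = 3

encrypted_counts = []
for labels in client_labels:
    counts = Counter(labels)
    encrypted_counts.append(
        [pub_key.encrypt(counts.get(c, 0)) for c in range(num_classes)]
    )

# Server-side aggregation operates on ciphertexts only.
agg = encrypted_counts[0]
for enc in encrypted_counts[1:]:
    agg = [a + e for a, e in zip(agg, enc)]

# Decryption of the aggregate (e.g., by a key holder other than the server).
global_counts = [priv_key.decrypt(c) for c in agg]
print("estimated global class counts:", global_counts)

# --- 2. Local re-sampling guided by the global distribution ---
def resample(data, labels, per_class_target):
    """Balance one client's data toward per_class_target examples per class
    by under-sampling majority classes and over-sampling minority classes."""
    by_class = {}
    for x, y in zip(data, labels):
        by_class.setdefault(y, []).append(x)
    balanced = []
    for y, xs in by_class.items():
        if len(xs) >= per_class_target:           # under-sample majority class
            picked = random.sample(xs, per_class_target)
        else:                                     # over-sample minority class
            picked = xs + random.choices(xs, k=per_class_target - len(xs))
        balanced += [(x, y) for x in picked]
    random.shuffle(balanced)
    return balanced

# Illustrative target rule (placeholder): mean class frequency per client.
target = max(1, sum(global_counts) // (num_classes * len(client_labels)))
client_a_data = list(range(len(client_labels[0])))  # dummy feature placeholders
balanced_a = resample(client_a_data, client_labels[0], target)
```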
Authors
ZHANG Jing; LI Chuanwen (School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China)
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD; Peking University Core Journals (北大核心)
2024, No. 7, pp. 1592-1598 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (61872071).
Keywords
privacy protection
extreme class imbalance
class imbalance
federated learning