Abstract
Federated Learning (FL) applications often face heterogeneous client data and differing task requirements, both of which call for personalized models. However, some existing Personalized Federated Learning (PFL) algorithms suffer from a trade-off between personalization and global generalization, and most of them retain the traditional FL practice of weighting aggregation by client data volume. This degrades model performance for clients whose data distributions differ substantially from the rest, and leaves the aggregation strategy itself unpersonalized. To address these problems, a PFL algorithm based on similarity clustering and regularization, pFedSCR, is proposed. In the client local update phase, pFedSCR trains both a personalized model and a local model; the personalized model adds an L2-norm regularization term to the cross-entropy loss to dynamically adjust how closely it follows the global model, achieving personalization while still drawing on global knowledge. In the server aggregation phase, client model updates are clustered by similarity to build an aggregation weight matrix, and the aggregation weights are adjusted dynamically so that a personalized model is aggregated for each client, which personalizes the parameter aggregation strategy while addressing data heterogeneity. Multiple Non-Independent and Identically Distributed (Non-IID) data scenarios were simulated with the Dirichlet distribution on three datasets: CIFAR-10, MNIST and Fashion-MNIST. The results show that pFedSCR outperforms other FL algorithms, including the classic FedProx and the recent personalized algorithm FedPCL (Federated Prototype-wise Contrastive Learning), in both accuracy and communication efficiency across all tested scenarios, reaching an accuracy of up to 99.03%.
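The abstract names three building blocks without giving formulas: Dirichlet-based Non-IID partitioning, an L2-regularized personalized loss, and a similarity-based aggregation weight matrix. The NumPy sketch below illustrates one plausible reading of each and is not the authors' implementation. All function names, the `alpha`, `lam` and `threshold` parameters, the `(lam / 2)` scaling, and the threshold-plus-softmax weighting (a crude stand-in for the unspecified clustering step) are assumptions; cosine similarity is used only because it appears among the keywords.

```python
import numpy as np


def dirichlet_partition(labels, num_clients, alpha, rng):
    """Split sample indices across clients class by class.

    Each class is divided using proportions drawn from Dirichlet(alpha);
    smaller alpha gives more skewed (more Non-IID) client label mixes.
    """
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert cumulative proportions into split points inside idx.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices


def l2_regularized_loss(task_loss, personal_w, global_w, lam):
    """Task loss (e.g. cross-entropy) plus (lam / 2) * ||personal_w - global_w||^2.

    Assumption: lam controls how closely the personalized model follows
    the global model; the abstract does not state the exact form.
    """
    return task_loss + 0.5 * lam * np.sum((personal_w - global_w) ** 2)


def similarity_weights(updates, threshold=0.0):
    """Row-stochastic aggregation matrix from cosine similarity of updates.

    updates: array of shape (num_clients, num_params), one flattened
    model update per client. Clients whose similarity to client i is not
    above `threshold` get zero weight in row i (assumed grouping rule).
    """
    norms = np.linalg.norm(updates, axis=1, keepdims=True)
    unit = updates / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    mask = sim > threshold
    np.fill_diagonal(mask, True)  # a client always counts itself
    weights = np.where(mask, np.exp(sim), 0.0)
    return weights / weights.sum(axis=1, keepdims=True)


def personalized_aggregate(client_params, agg_matrix):
    """Row i of the result is the personalized aggregate for client i."""
    return agg_matrix @ client_params
```

With `updates` stacked as one row per client, `personalized_aggregate(client_params, similarity_weights(updates))` returns one aggregated parameter vector per client instead of a single global average, which is the sense in which the aggregation is personalized.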
Authors
WU Jie (巫婕)
QIAN Xuezhong (钱雪忠)
SONG Wei (宋威)
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal (北大核心)
2024, No. 11, pp. 3345-3353 (9 pages)
Funding
National Natural Science Foundation of China (62076110).
Keywords
Federated Learning (FL)
Non-Independent and Identically Distributed (Non-IID)
cosine similarity
regularization
Personalized Federated Learning (PFL)
privacy security