Personalized federated learning based on similarity clustering and regularization (基于相似度聚类和正则化的个性化联邦学习)

Abstract: In Federated Learning (FL) applications, client data heterogeneity and differing task requirements often call for personalized models. However, some existing Personalized Federated Learning (PFL) algorithms face a trade-off between personalization and global generalization, and most of them adopt the traditional FL approach of weighting aggregation by client data volume, which degrades model performance for clients whose data distributions differ greatly and leaves them without a personalized aggregation strategy. To address these problems, a PFL algorithm based on similarity clustering and regularization, named pFedSCR, was proposed. In the client local update phase, pFedSCR trains a personalized model and a local model; the personalized model introduces L2-norm regularization into the cross-entropy loss function to dynamically adjust how closely it follows the global model, achieving personalization while drawing on global knowledge. In the server aggregation phase, clients are clustered according to the similarity of their model updates, an aggregation weight matrix is constructed, and the aggregation weights are dynamically adjusted to aggregate a personalized model for each client, making the parameter aggregation strategy personalized while addressing data heterogeneity. Multiple non-independent and identically distributed (Non-IID) data scenarios were simulated with the Dirichlet distribution on three datasets: CIFAR-10, MNIST and Fashion-MNIST. The results show that pFedSCR outperforms FL algorithms such as the classic FedProx and the recent personalized algorithm FedPCL (Federated Prototype-wise Contrastive Learning) in accuracy and communication efficiency in all tested scenarios, reaching an accuracy of up to 99.03%.
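The abstract names two mechanisms without giving formulas: an L2-regularized cross-entropy loss on the client, and similarity-based aggregation weights on the server. The sketch below illustrates one plausible reading of each, using NumPy. Everything specific here is an assumption and is not taken from the paper: the proximal form `(lam / 2) * ||w_personal - w_global||^2`, the fixed coefficient `lam` (the paper adjusts it dynamically), the function names, and the simple similarity threshold that stands in for the paper's clustering step.

```python
import numpy as np


def l2_regularized_loss(probs, labels, personal_w, global_w, lam):
    """Cross-entropy plus an L2 penalty pulling personal_w toward global_w.

    Assumed form: CE + (lam / 2) * ||personal_w - global_w||^2.
    probs: (n_samples, n_classes) predicted probabilities.
    labels: (n_samples,) integer class labels.
    """
    eps = 1e-12  # avoids log(0)
    picked = probs[np.arange(len(labels)), labels]
    cross_entropy = -np.mean(np.log(picked + eps))
    penalty = 0.5 * lam * np.sum((personal_w - global_w) ** 2)
    return cross_entropy + penalty


def cosine_similarity_matrix(updates):
    """Pairwise cosine similarity between flattened client updates (one per row)."""
    norms = np.linalg.norm(updates, axis=1, keepdims=True)
    unit = updates / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 1.0)  # every client is fully similar to itself
    return sim


def aggregation_weights(updates, threshold=0.5):
    """Row-normalized weight matrix built from update similarity.

    Hypothetical grouping rule: client j contributes to client i only if
    their cosine similarity is at least `threshold`.
    """
    sim = cosine_similarity_matrix(updates)
    weights = np.where(sim >= threshold, sim, 0.0)
    return weights / weights.sum(axis=1, keepdims=True)


def personalized_aggregate(client_params, updates, threshold=0.5):
    """Return one aggregated parameter vector per client (rows)."""
    return aggregation_weights(updates, threshold) @ client_params
```

With three clients whose updates are `[1, 0]`, `[0.9, 0.1]` and `[-1, 0]`, the first two share weight with each other while the third, pointing the opposite way, aggregates only its own parameters. This contrasts with data-volume weighting, where all three would receive the same averaged model.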

Authors: WU Jie (巫婕); QIAN Xuezhong (钱雪忠); SONG Wei (宋威) (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China)

Affiliation: School of Artificial Intelligence and Computer Science, Jiangnan University

Source: Journal of Computer Applications (《计算机应用》); indexed in CSCD and the Peking University Core Journals list; 2024, No. 11, pp. 3345-3353 (9 pages)

Funding: National Natural Science Foundation of China (62076110).

Keywords: Federated Learning (FL); non-independent and identically distributed (Non-IID); cosine similarity; regularization; Personalized Federated Learning (PFL); privacy security

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
