Abstract
Federated Learning (FL) is a distributed machine learning method that aims to jointly train a global model, but a single global model struggles to handle multiple data distributions. To address this multi-distribution challenge, clustered federated learning was introduced, which groups clients and optimizes multiple shared models. Within this approach, server-side clustering has difficulty correcting clustering errors, while client-side clustering is highly sensitive to the choice of initial models. To solve these problems, an Automatically Adjusted Clustered Federated Learning (AACFL) framework was proposed, which uses double-ended clustering to integrate server-side and client-side clustering. First, double-ended clustering was used to divide the clients into adjustable clusters. Then, the identities of local clients were adjusted automatically. Finally, the correct client clusters were obtained. AACFL was evaluated on three classical federated datasets under non-independent and identically distributed (non-IID) conditions. Experimental results show that AACFL can obtain the correct clusters through adjustment even when the double-ended clustering results contain errors. When the number of clusters is 4 and the number of clients is 100, compared with the Federated Averaging (FedAvg) algorithm, Clustered Federated Learning (CFL), the Iterative Federated Clustering Algorithm (IFCA), and other methods, AACFL effectively improves both the model convergence speed and the speed of obtaining correct clustering results, and improves accuracy by 0.20 to 23.16 percentage points on average. These results verify that the proposed framework can cluster efficiently and improve model convergence speed and accuracy.
Authors
YIN Chunyong (尹春勇)
ZHOU Yongcheng (周永成)
YIN Chunyong; ZHOU Yongcheng (School of Computer Science, School of Cyberspace Security, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044, China; School of Software, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044, China)
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2024, No. 10, pp. 3011-3020 (10 pages)
Funding
National Natural Science Foundation of China (6177282).
Keywords
Federated Learning (FL)
clustering
heterogeneous data
distributed machine learning
neural network