To leverage the enormous amount of unlabeled data on distributed edge devices, we formulate a new problem in federated learning called federated unsupervised representation learning (FURL), which aims to learn a common representation model without supervision while preserving data privacy. FURL poses two new challenges: (1) the data distribution shift (non-independent and identically distributed, non-IID) among clients makes local models focus on different categories, leading to inconsistent representation spaces; (2) without unified information shared among clients, the representations learned across clients become misaligned. To address these challenges, we propose the federated contrastive averaging with dictionary and alignment (FedCA) algorithm. FedCA consists of two key modules: a dictionary module, which aggregates sample representations from each client and shares them with all clients to keep the representation space consistent, and an alignment module, which aligns each client's representations to a base model trained on public data. We adopt a contrastive approach for local model training. Through extensive experiments with three evaluation protocols in both IID and non-IID settings, we demonstrate that FedCA outperforms all baselines by significant margins.
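As a rough illustration of the approach described above, the following is a minimal sketch of one FedCA-style communication round, assuming a PyTorch setup: each client trains its encoder with a contrastive loss whose negatives are drawn from the shared dictionary, adds an alignment term toward a base model trained on public data, and the server averages the returned parameters. The function names (`contrastive_loss`, `local_update`, `fedavg`, `server_round`), the MoCo-style InfoNCE form, the MSE alignment term, and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a FedCA-style round; details (loss form, hyperparameters) are assumptions.
import copy
import torch
import torch.nn.functional as F

def contrastive_loss(q, k, dictionary, temperature=0.5):
    """InfoNCE-style loss: (q, k) is the positive pair of augmented views,
    negatives come from the dictionary of representations shared by all clients."""
    q, k = F.normalize(q, dim=1), F.normalize(k, dim=1)
    pos = (q * k).sum(dim=1, keepdim=True)          # (N, 1) positive logits
    neg = q @ dictionary.t()                        # (N, K) negatives from shared dictionary
    logits = torch.cat([pos, neg], dim=1) / temperature
    targets = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, targets)

def local_update(model, base_model, loader, dictionary, lam=0.1, lr=1e-3):
    """One client's local training pass: contrastive loss against the shared dictionary,
    plus an alignment term pulling representations toward a base model trained on public data."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    base_model.eval()
    new_entries = []
    for x1, x2 in loader:                           # two augmented views per sample
        z1, z2 = model(x1), model(x2)
        loss = contrastive_loss(z1, z2, dictionary)
        with torch.no_grad():                       # alignment module: fixed public-data encoder
            z_ref = base_model(x1)
        loss = loss + lam * F.mse_loss(F.normalize(z1, dim=1), F.normalize(z_ref, dim=1))
        opt.zero_grad(); loss.backward(); opt.step()
        new_entries.append(F.normalize(z1.detach(), dim=1))
    # Representations contributed to the shared dictionary for the next round.
    return model.state_dict(), torch.cat(new_entries)

def fedavg(states, weights):
    """Federated averaging of client parameters (weights sum to 1)."""
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = sum(w * s[key] for s, w in zip(states, weights))
    return avg

def server_round(global_state, clients, base_model, dictionary):
    """One communication round: broadcast global weights, run local updates,
    aggregate with FedAvg, and refresh the shared dictionary."""
    states, entries = [], []
    for model, loader in clients:
        model.load_state_dict(global_state)
        state, new_reps = local_update(model, base_model, loader, dictionary)
        states.append(state); entries.append(new_reps)
    weights = [1.0 / len(states)] * len(states)
    return fedavg(states, weights), torch.cat(entries)
```

In this sketch the dictionary plays the role of a negative-sample bank shared across clients, which is one way the representation spaces can be kept consistent under non-IID data; the alignment term anchors every client to the same public-data base model so that representations do not drift apart between rounds.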
Funding: supported by the National Key Research & Development Project of China (Nos. 2021ZD0110700 and 2021ZD0110400), the National Natural Science Foundation of China (Nos. U20A20387, U19B2043, 61976185, and U19B2042), the Zhejiang Natural Science Foundation, China (No. LR19F020002), the Zhejiang Innovation Foundation, China (No. 2019R52002), and the Fundamental Research Funds for the Central Universities, China.