Data augmentation scheme for federated learning with non-IID data

Cited by: 1


Abstract: To address the low model accuracy that arises in federated learning when data are not independent and identically distributed (non-IID) across clients, a privacy-preserving data augmentation scheme was proposed. First, a data augmentation framework for federated learning was designed, in which clients generate synthetic samples locally and share them with one another, mitigating the client drift caused by differences among clients' data distributions during training. Second, based on generative adversarial networks and differential privacy, a privacy-preserving sample generation algorithm was proposed that lets clients generate usable synthetic samples while preserving the privacy of their local data. Finally, a privacy-preserving label selection algorithm was proposed so that the labels of synthetic samples also satisfy differential privacy. Simulation results show that, under multiple non-IID data partition strategies, the proposed scheme consistently improves model accuracy and speeds up convergence. Compared with the baseline methods, it achieves an accuracy improvement of at least 25% in the extreme non-IID scenario where each client holds only one class of samples.
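The abstract does not spell out the partition strategy or the label selection algorithm, so the following is only a minimal sketch under stated assumptions. It shows (a) one way to build the extreme non-IID split in which each client holds a single class, and (b) a differentially private label choice using the standard exponential mechanism as a stand-in for the paper's algorithm. The function names `partition_one_class_per_client` and `dp_select_label`, and the choice of the label count as the utility score, are hypothetical and not taken from the paper.

```python
import numpy as np


def partition_one_class_per_client(labels, num_clients, rng):
    """Extreme label-skew split: every client gets samples of one class only.

    Client k is assigned class (k mod number_of_classes); clients that share
    a class split that class's sample indices evenly.
    """
    classes = np.unique(labels)
    parts = [None] * num_clients
    for ci, c in enumerate(classes):
        owners = list(range(ci, num_clients, len(classes)))
        if not owners:
            continue  # fewer clients than classes: this class is unused
        idx = rng.permutation(np.flatnonzero(labels == c))
        for owner, chunk in zip(owners, np.array_split(idx, len(owners))):
            parts[owner] = chunk
    return parts


def dp_select_label(local_labels, num_classes, epsilon, rng):
    """Pick one label with the exponential mechanism (assumed stand-in).

    Utility of a label is its count in the local data. Adding or removing
    one record changes any count by at most 1, so the sensitivity is 1 and
    the selection probability is proportional to exp(epsilon * count / 2).
    """
    counts = np.bincount(local_labels, minlength=num_classes).astype(float)
    scores = epsilon * counts / 2.0
    scores -= scores.max()  # shift for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum()
    return int(rng.choice(num_classes, p=probs))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.repeat(np.arange(10), 20)  # toy data: 10 classes, 20 samples each
    parts = partition_one_class_per_client(labels, num_clients=10, rng=rng)
    for k, idx in enumerate(parts):
        chosen = dp_select_label(labels[idx], num_classes=10, epsilon=1.0, rng=rng)
        print(f"client {k}: local class {labels[idx][0]}, DP-selected label {chosen}")
```

A smaller `epsilon` makes the selected label less dependent on the client's true label distribution, which is the usual privacy-utility trade-off; how the paper balances it is not stated in the abstract.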

Authors: TANG Lingtao; WANG Di; LIU Shengyun

Affiliations: State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China; School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Source: Journal on Communications (通信学报), indexed in EI, CSCD, and the Peking University Core Journals list; 2023, No. 1, pp. 164-176 (13 pages)

Funding: National Key Research and Development Program of China (No.2016YFB1000500); National Science and Technology Major Project of China (No.2018ZX01028102)

Keywords: federated learning; non-IID; generative adversarial network; differential privacy; data augmentation

Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]
