
基于强化学习的可持续联邦学习激励机制设计

Incentive Mechanism Design for Sustainable Federated Learning Based on Reinforcement Learning
Abstract

With data widely used across the Internet, the Internet of Things, and artificial intelligence, data sharing has become one of the key engines of economic and technological development. However, concerns over data privacy, law, and related issues make data sharing difficult. Federated learning, an emerging machine learning paradigm, has attracted attention because it enables multi-party collaboration while protecting data privacy. This paper focuses on long-term cross-silo federated learning cooperation and aims to address the costs and risks that data owners face when participating. We first build a dynamic game model that captures the interactive strategies of federated clients. We then propose a reinforcement learning-based incentive mechanism in which a central planner sets incentives for different training periods, effectively promoting client participation. Experiments show that the incentive scheme markedly improves total system revenue while controlling incentive costs. The paper offers an effective incentive design for sustainable federated learning and may help promote data sharing and cooperative models across domains.

Extended abstract

As Internet, Internet of Things (IoT), and Artificial Intelligence (AI) technologies rapidly evolve, data has become a critical driving force behind economic and technological advancement. Companies can use data analysis to gain comprehensive insight into customer behavior, market trends, and operational performance, and thereby make informed decisions and improve overall performance. However, a single organization's data may not be sufficient for comprehensive analysis. For instance, building an accurate marketing model to target users may require data from multiple sources, such as telecom operators, social networking sites, and e-commerce platforms. This data scarcity calls for data-sharing mechanisms, which are often fraught with concerns surrounding data privacy, ethics, and legality. In this regard, Federated Learning (FL), a novel machine learning paradigm, has attracted increasing attention. FL participants train local models, keep their data private, and exchange only model parameters with servers or other peers, which lets them capitalize on the value of the data. This "data-available-but-not-visible" approach is gaining popularity in data-intensive fields.

Many FL tasks cannot be completed in a single round and require sustained collaboration among multiple parties. For example, when multiple medical institutions jointly develop an FL model to detect and manage chronic diseases, they need to keep accumulating clinical data, learn from changes in cases, and improve the model's robustness and predictive ability so that it reflects the latest medical knowledge and practices. The current literature on FL cooperative behavior and incentive mechanisms, however, focuses primarily on cross-device federated learning and considers only one-off cooperation. This modeling is inadequate for characterizing practical long-term cross-silo FL. On the one hand, cross-silo FL participants, who themselves hold a certain amount of data, have more complex and diverse strategic options than participants in cross-device FL: they can join public training or improve their model utility through local training alone. On the other hand, when cooperation moves from a one-off to a long-term setting, time inconsistency may lead to free-riding, giving participants an incentive to delay their data contributions while enjoying the benefits of others' contributions. To address these limitations, this study concentrates on the long-term cross-silo FL process. It establishes a dynamic game model to characterize federated clients' interactive strategies and proposes a reinforcement learning-based incentive mechanism to encourage rational participants to contribute, with the aim of raising the FL system's overall revenue.

This paper first establishes a dynamic game model to characterize federated clients' long-term interactive strategies. We devise a cooperation contract in which the central server transmits the aggregated parameters only to the contributors of the current training period. The long-term cross-silo FL cooperation process is divided into several model training periods, and in each period clients have two strategic choices: participate in public federated training, or retain their data for local training only. At the end of each period, clients receive feedback parameters from the central server and gain benefits according to the accuracy of their local models. In this framework, clients face a trade-off between participation costs and the potential benefits of contributing early. Because information accumulates in the model as clients contribute, clients also face a cross-period decision problem: how to allocate resources over the entire long-term FL cooperation process. Based on these assumptions, the paper builds a game tree in which clients' decisions in each training period rest on full knowledge of past cooperation and rational expectations of future actions. Solving by backward induction, we obtain the clients' equilibrium strategy, which exhibits intermittent contribution gaps and clearly deviates from the socially optimal cooperative pattern.

Building on this game analysis, the paper then designs a dynamic incentive scheme based on reinforcement learning that sets incentives for different training periods according to the clients' cooperation progress. First, we regard the FL organization as a central planner responsible for issuing incentives before each training period to encourage federated clients to contribute. A Deep Reinforcement Learning (DRL) agent assists the central planner in making incentive decisions, and the federated clients serve as the environment with which the agent interacts. On the one hand, we carefully design the state, action, and reward of the DRL method so that they fully capture the information in the federated learning cooperation process. On the other hand, we enhance the traditional Deep Q-Network (DQN) method with Double Deep Q-Network (DDQN), prioritized replay, and noisy networks to improve its performance.

Through extensive experiments, we verify the scheme's effectiveness in improving the system's total revenue and controlling incentive costs. A reasonable penalty on incentive cost guides the DRL agent toward the most cost-effective incentive scheme, which accurately targets the periods in which clients are least willing to cooperate, and the system revenue under the same budget significantly exceeds that of fixed incentives. This paper not only uncovers, in theory, the dynamic patterns of long-term cross-silo federated learning cooperation but also proposes an incentive mechanism that enhances cooperation efficiency, offering fresh insights and methods for facilitating data sharing and cooperation in the contemporary information era.
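
As a rough illustration of two ideas named in the abstract, the sketch below shows a planner reward with an incentive-cost penalty and the Double DQN target next to the plain DQN target. This is not the authors' implementation. The function names, the linear penalty form `system_revenue - cost_penalty * incentive_cost`, and the reading of each action index as a discrete incentive level are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Sequence


def planner_reward(system_revenue: float, incentive_cost: float,
                   cost_penalty: float) -> float:
    """Hypothetical per-period reward: revenue minus a penalized incentive cost."""
    return system_revenue - cost_penalty * incentive_cost


def dqn_target(reward: float, next_q_target: Sequence[float],
               gamma: float, done: bool = False) -> float:
    """Plain DQN target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float, done: bool = False) -> float:
    """Double DQN target: the online network selects the next action,
    and the target network evaluates that chosen action."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Assumed toy numbers: three incentive levels (actions 0, 1, 2).
    r = planner_reward(system_revenue=10.0, incentive_cost=4.0, cost_penalty=0.5)
    q_online = [0.2, 0.9, 0.1]   # next-state Q-values from the online network
    q_target = [0.5, 0.3, 0.8]   # next-state Q-values from the target network
    print("reward:", r)
    print("DQN target:", dqn_target(r, q_target, gamma=0.5))
    print("Double DQN target:", double_dqn_target(r, q_online, q_target, gamma=0.5))
```

Separating action selection from action evaluation is what distinguishes Double DQN from plain DQN: in the toy numbers the plain target uses the largest target-network value, while the Double DQN target uses the target-network value of the action preferred by the online network, which tends to reduce overestimation.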
Authors: 艾秋媛 (Qiuyuan Ai); 詹志坚 (Zhijian Zhan); 王聪 (Cong Wang); 宋洁 (Jie Song). Affiliations: College of Engineering, Peking University; Academy for Advanced Interdisciplinary Studies, Peking University; Guanghua School of Management, Peking University
Source: 《经济管理学刊》 (Quarterly Journal of Economics and Management), 2024, No. 1, pp. 115-144 (30 pages)
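Funding: This research was supported by the National Natural Science Foundation of China through a Key Program grant (72131001), a Young Scientists Fund grant (72101007), and a Special Program grant (72241420).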
Keywords: Data Sharing; Federated Learning; Incentive Mechanism; Stable Cooperation