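Abstract
Data has become an important factor of production alongside land, labor, capital, and technology. Mining the latent value of data through data analysis helps drive industrial innovation, technological upgrading, and regional economic development. However, risks such as privacy leakage during data use restrict the circulation and sharing of data. How to protect data privacy during data circulation and sharing has therefore become a research hotspot. Federated unlearning revokes the training updates that a user's data contributed to a federated learning model, further protecting the data security of federated learning users. This paper surveys research on federated unlearning. First, it briefly describes the federated learning architecture and introduces the concepts and definitions of unlearning and federated unlearning. Second, it divides federated unlearning algorithms into global model-oriented and local model-oriented categories according to the object being modified, and analyzes the implementation details, strengths, and weaknesses of each category. Third, it details the evaluation metrics commonly used in federated unlearning, grouping them into model performance metrics, forgetting effect metrics, and privacy protection metrics, and analyzes the strengths and weaknesses of each type. Finally, it discusses future research directions for federated unlearning.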
Data has become an important factor of production alongside land, labor, capital, and technology. Mining the potential value of data through data analysis yields insights into consumer behavior, market trends, and production efficiency, thereby promoting industrial innovation, technological upgrading, and regional economic development. However, using and sharing data can cause privacy leakage, and inadequate privacy protection has led to more serious issues, such as the leakage of sensitive data and illegal cross-border data transfers. For instance, some financial companies, lacking comprehensive privacy protection mechanisms for collecting, circulating, and utilizing user data, have experienced incidents in which data was used and traded without user consent. As a result, privacy risks severely hinder data circulation and sharing. Federated unlearning can roll back the training updates that a user's data contributed to the machine learning model, which further protects the data privacy and security of users. In this paper, we review research on federated unlearning. First, we analyze the federated learning training architecture in depth, highlighting the specific types of privacy leakage threats. To reduce the risk of privacy leakage, we introduce the concept and definition of unlearning and list different unlearning scenarios, which leads to the concept of federated unlearning. On this basis, we outline the federated unlearning process and introduce unlearning granularity and challenges. Second, we divide federated unlearning algorithms into two categories according to the object being modified: global model-oriented and local model-oriented algorithms. We further divide each category into several subcategories and analyze the implementation details of each algorithm in depth. To compare their strengths and weaknesses, we conduct detailed comparative analyses across the categories, focusing on aspects such as algorithm performance, types of requesters, and forgetting requests. We also conduct an experiment to show the model accuracy of the different categories of federated unlearning algorithms. Third, we divide the commonly used performance metrics into three categories: model performance metrics, forgetting effect metrics, and privacy protection metrics. We compare and analyze these metrics in detail in terms of the unlearning stage, as well as their advantages and drawbacks. Fourth, we summarize the research on and applications of federated unlearning in privacy protection and attack resistance, including the protection of commercial information privacy, federated recommendation systems, and federated clustering. Finally, this paper looks ahead to future research directions for unlearning algorithms and applications from a personalized perspective, including promoting the market circulation of data elements, deletion of low-quality data, forgetting applications in cross-domain machine learning, customized services, and federated unlearning in special scenarios.
Authors
王鹏飞
魏宗正
周东生
宋威
肖蕴明
孙庚
于硕
张强
WANG Peng-Fei; WEI Zong-Zheng; ZHOU Dong-Sheng; SONG Wei; XIAO Yun-Ming; SUN Geng; YU Shuo; ZHANG Qiang (School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024; Key Laboratory of Social Computing and Cognitive Intelligence (Dalian University of Technology), Ministry of Education, Dalian, Liaoning 116024; Key Laboratory of Advanced Design and Intelligent Computing (Ministry of Education), Dalian University, Dalian, Liaoning 116024; Department of Computer Science, Northwestern University, Evanston 60208, USA; School of Computer Science and Technology, Jilin University, Changchun 130012; Key Laboratory of Symbolic Computing and Knowledge Engineering (Ministry of Education), Jilin University, Changchun 130012)
Source
《计算机学报》
EI
CSCD
PKU Core Journal (北大核心)
2024, No. 2, pp. 396-422 (27 pages)
Chinese Journal of Computers
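Funding
National Key Research and Development Program of China (2021ZD0112400)
Joint Funds of the National Natural Science Foundation of China (U1908214)
National Natural Science Foundation of China, Young Scientists Fund (62202080)
China Postdoctoral Science Foundation, General Program (2023M733354)
Fundamental Research Funds for the Central Universities (DUT23YG122)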
Keywords
联邦学习
联邦忘却学习
数字经济
隐私保护
边缘智能
federated learning
federated unlearning
digital economy
privacy preserving
edge intelligence