Abstract
As a distributed machine learning framework, federated learning can train models without disclosing user data. However, recent attacks have shown that merely keeping data local during training does not provide sufficient privacy guarantees. To address the privacy problems that arise during federated learning training, this paper proposes a BERT-based text classification model that combines differential privacy (DP) with federated learning (FL), protecting the federated training process from inference attacks while the federated learning parameters are transferred. Experiments show that the proposed method maintains high model accuracy while protecting privacy.
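The abstract does not say how differential privacy is applied to the transferred parameters. As a rough illustration only, the sketch below assumes one common pattern: each client clips its model update to a fixed L2 norm and adds Gaussian noise before upload, and the server averages the noisy updates. The function names, `clip_norm`, and `noise_multiplier` are hypothetical and are not taken from the paper. Plain Python lists stand in for the BERT parameters.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= clip_norm:
        return list(update)
    scale = clip_norm / norm
    return [x * scale for x in update]


def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Client side (assumed): clip the update, then add Gaussian noise."""
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm  # noise scale tied to the clipping bound
    return [x + rng.gauss(0.0, sigma) for x in clipped]


def federated_average(updates):
    """Server side: coordinate-wise mean of the uploaded updates."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    client_updates = [[0.5, -1.2, 3.0], [0.1, 0.4, -0.3], [2.0, 2.0, 2.0]]
    noisy = [
        privatize_update(u, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
        for u in client_updates
    ]
    print(federated_average(noisy))
```

In this sketch the server only ever sees clipped, noised updates, which is the property that limits what an inference attack can learn from the transferred parameters. A larger `noise_multiplier` strengthens the privacy guarantee at the cost of accuracy, which is the trade-off the abstract refers to.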
Authors
Sheng Xuechen (盛雪晨)
Chen Danwei (陈丹伟)
Department of Computer, Department of Software, Department of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023
Source
Journal of Information Security Research (《信息安全研究》)
CSCD
2023, Issue 12, pp. 1145-1151 (7 pages)
Funding
National Key Research and Development Program of China (2019YFB2101704).
Keywords
text classification
distributed computing
federated learning
differential privacy
privacy protection