
Federated Decision Tree Algorithm for Privacy Security (cited by: 8)
Abstract In recent years, with the rapid development of technology and related industries, the advantages of Internet finance have become increasingly apparent. Qualification review based on user information has long been an important business in the financial field. In most cases, when an individual applies for a loan, the bank evaluates the applicant with an established predictive model to decide whether to grant the loan, and a high-quality default risk assessment helps the bank avoid unnecessary losses. Current research on default risk assessment by banks and other lending institutions still has deficiencies, however. On the one hand, a lack of user data makes it difficult to build a high-quality prediction model; on the other hand, people are paying more and more attention to the privacy of personal data, so it is hard for banks to obtain large amounts of relevant data. As a result, these institutions cannot train high-performance default risk assessment models and cannot predict users' situations accurately.

To solve the problem of joint modeling when data is not shared, this paper introduces federated learning, which makes use of other participants' data to establish a shared predictive model without that data leaving its local site. Federated learning is a concept put forward by Google in 2016 that completes joint modeling without data sharing. Specifically, each owner's data stays local, and the globally shared model is built jointly by exchanging parameters under an encryption mechanism within the federated system, without violating data privacy protection regulations. Moreover, each participant serves only its own local objectives. Because decision tree algorithms are widely used in financial risk control and fraud identification, this paper proposes a decision tree algorithm based on federated learning, FL-DT (Federated Learning-Decision Tree). Firstly, a histogram-based data storage structure is constructed for communication, which improves training efficiency by reducing the number of communication rounds. Secondly, a garbled Bloom filter based on oblivious transfer is proposed to perform private set intersection, yielding a federated histogram that contains the data information of each participant, from which the federated decision tree model is established. Finally, a multi-party collaborative prediction algorithm is put forward to improve the prediction efficiency of FL-DT.

The accuracy and effectiveness of FL-DT are evaluated on four commonly used financial datasets. The experimental results show that the prediction accuracy of the FL-DT model is higher than that of models built using only local data and approaches that of a model built on centralized data. In addition, the prediction accuracy of FL-DT is better than that of other federated learning methods, and its training efficiency and prediction efficiency are also better than those of existing algorithms.
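The idea of exchanging histograms instead of raw records can be illustrated with a minimal sketch. It is not the FL-DT implementation: the abstract does not specify the histogram layout or the split criterion, so the function names (`build_histogram`, `gini`, `best_split`), the fixed bin edges, the use of Gini impurity, and the toy data are all assumptions made for illustration. The sketch also assumes the samples are already aligned across parties and that labels are visible in the clear; the garbled Bloom filter, oblivious transfer, and encryption steps are omitted entirely.

```python
from collections import Counter


def build_histogram(values, labels, bin_edges):
    """Per-bin label counts for one feature: hist[b][label] = count.

    Only this summary, not the raw values, would be sent to other parties.
    """
    hist = [Counter() for _ in range(len(bin_edges) + 1)]
    for v, y in zip(values, labels):
        b = sum(v > e for e in bin_edges)  # index of the bin containing v
        hist[b][y] += 1
    return hist


def gini(counts):
    """Gini impurity of a label-count table (0.0 for an empty table)."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(hist):
    """Return (bin_index, weighted_gini) for the best 'bin <= index' split."""
    total = Counter()
    for c in hist:
        total.update(c)
    n = sum(total.values())
    left = Counter()
    best = (None, float("inf"))
    for b in range(len(hist) - 1):
        left.update(hist[b])
        right = total - left
        n_left = sum(left.values())
        if n_left == 0 or n_left == n:
            continue  # a split that leaves one side empty is useless
        score = (n_left * gini(left) + (n - n_left) * gini(right)) / n
        if score < best[1]:
            best = (b, score)
    return best


# Hypothetical toy data: two parties hold different features of the same
# six aligned users.
labels = [0, 0, 0, 1, 1, 1]
hist_a = build_histogram([10, 20, 30, 40, 50, 60], labels, bin_edges=[35])
hist_b = build_histogram([25, 52, 31, 47, 29, 55], labels, bin_edges=[40])

# A coordinator compares candidate splits using the histograms alone.
candidates = {"party_a": best_split(hist_a), "party_b": best_split(hist_b)}
winner = min(candidates, key=lambda name: candidates[name][1])
print(winner, candidates[winner])
```

Because each party sends one histogram per feature rather than one message per sample, the amount of communication depends on the number of bins, not on the number of records, which is the kind of saving the abstract attributes to the histogram structure.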
Authors GUO Yan-Qing (郭艳卿); WANG Xin-Lei (王鑫磊); FU Hai-Yan (付海燕); LIU Hang (刘航); YAO Ming (姚明) (School of Information and Communication Engineering, Dalian University of Technology, Dalian, Liaoning 116024; Data Intelligence Department of InsightOne Tech Co., Ltd, Beijing 100007)
Source Chinese Journal of Computers (《计算机学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2021, No. 10, pp. 2090-2103 (14 pages)
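Funding Supported by the National Natural Science Foundation of China (No. 62076052, No. U1736119) and the Fundamental Research Funds for the Central Universities (No. DUT20TD110, No. DUT20RC(3)088).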
Keywords federated learning; decision tree; garbled Bloom filter; privacy security; data not shared