3 articles found
Data complexity-based batch sanitization method against poison in distributed learning
1
Authors: Silv Wang, Kai Fan, Kuan Zhang, Hui Li, Yintang Yang. Digital Communications and Networks (SCIE, CSCD), 2024, No. 2, pp. 416-428 (13 pages)
The security of Federated Learning (FL)/Distributed Machine Learning (DML) is gravely threatened by data poisoning attacks, which destroy the usability of the model by contaminating training samples; such attacks are called causative availability indiscriminate attacks. Because existing data sanitization methods are hard to apply to real-time applications due to their tedious process and heavy computation, we propose a new supervised batch detection method for poison, which can quickly sanitize the training dataset before local model training. We design a training dataset generation method that helps to enhance accuracy and uses data complexity features to train a detection model, which is then used in an efficient batch hierarchical detection process. Our model stockpiles knowledge about poison, which can be expanded by retraining to adapt to new attacks. Being neither attack-specific nor scenario-specific, our method is applicable to FL/DML or other online or offline scenarios.
Keywords: distributed machine learning security; federated learning; data poisoning attacks; data sanitization; batch detection; data complexity
An Optimized Sanitization Approach for Minable Data Publication
2
Authors: Fan Yang, Xiaofeng Liao. Big Data Mining and Analytics (EI), 2022, No. 3, pp. 257-269 (13 pages)
Minable data publication is ubiquitous since it is beneficial to sharing/trading data among commercial companies and further facilitates the development of data-driven tasks. Unfortunately, minable data publication is often implemented by publishers with limited privacy concerns, such that the published dataset is minable by malicious entities. This hinders minable data publication, since the published data may contain sensitive information. Approaches and technologies for reducing the risk of privacy leakage are therefore urgently needed. To this end, we propose an optimized sanitization approach for minable data publication (named SA-MDP). SA-MDP supports association rule mining while providing privacy protection for specific rules. In SA-MDP, we consider the trade-off between data utility and data privacy in the minable data publication problem. To address this problem, SA-MDP designs a customized particle swarm optimization (PSO) algorithm, where the optimization objective is determined by both data utility and data privacy. Specifically, we take advantage of PSO to produce new particles, which is achieved by random mutation or by learning from the best particle. Hence, SA-MDP can avoid solutions being trapped in local optima. In addition, we design a fitness function to guide the particles towards the optimal solution, and we present a preprocessing method, applied before the evolution process of the customized PSO algorithm, to improve the convergence rate. Finally, the proposed SA-MDP approach is run and verified on several datasets. The experimental results demonstrate the effectiveness and efficiency of SA-MDP.
Keywords: data publication; data sanitization; association rules hiding; evolutionary algorithm
Inference Attacks on Genomic Data Based on Probabilistic Graphical Models (cited by: 3)
3
Authors: Zaobo He, Junxiu Zhou. Big Data Mining and Analytics (EI), 2020, No. 3, pp. 225-233 (9 pages)
The rapid progress and plummeting costs of human-genome sequencing enable the availability of large amounts of personal biomedical information, leading to one of the most important concerns: genomic data privacy. Since personal biomedical data are highly correlated among relatives, the increasing availability of genomes and personal traits online (i.e., leaked unwittingly, or released intentionally to genetic service platforms) threatens kin-genomic data privacy. We propose new inference attacks to predict unknown Single Nucleotide Polymorphisms (SNPs) and human traits of individuals in a familial genomic dataset based on probabilistic graphical models and belief propagation. With this method, the adversary can predict the unobserved genomes or traits of targeted individuals in a family genomic dataset where some individuals' genomes and traits are observed, relying on SNP-trait associations from Genome-Wide Association Studies (GWAS), Mendel's Laws, and statistical relations between SNPs. Existing genome inferences have relatively high computational complexity with the input of tens of millions of SNPs and human traits. Then, we propose an approach to publish genomic data with a differential privacy guarantee. After finding an approximate distribution of the input genomic dataset relying on Bayesian networks, a noisy distribution is obtained by injecting noise into the approximate distribution. Finally, a synthetic genomic dataset is sampled, and it is proved that any query on the synthetic dataset satisfies the differential privacy guarantee.
Keywords: Single Nucleotide Polymorphism (SNP)-trait association; belief propagation; factor graph; data sanitization