Abstract: The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. We describe different black-box attacks from potential adversaries and study their impact on the amount and type of information that may be recovered from commonly used and deployed LLMs. Our research investigates the relationship between PII leakage, memorization, and factors such as model size, architecture, and the nature of the attacks employed. The study uses two broad categories of attacks: PII leakage-focused attacks (auto-completion and extraction attacks) and memorization-focused attacks (various membership inference attacks). The findings are quantified using an array of evaluation metrics, providing a detailed understanding of LLM vulnerabilities and the effectiveness of different attacks.
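The auto-completion attack mentioned in this abstract can be illustrated with a minimal probe: feed the model a prefix that precedes a known piece of PII in the fine-tuning corpus and check whether the completion reproduces it. The sketch below is an illustrative assumption, not the paper's protocol; "gpt2" is only a stand-in for the fine-tuned target model, and the prefix/PII pairs are hypothetical.

```python
# Minimal sketch of an auto-completion (prefix-probing) attack.
# Assumptions: "gpt2" stands in for the fine-tuned target model, and
# `probes` holds hypothetical (prefix, secret) pairs from its training data.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

probes = [
    ("Contact our lead engineer Jane Roe at ", "jane.roe@example.com"),  # hypothetical PII
    ("Patient record 4471, account number ", "9934-118-220"),            # hypothetical PII
]

leaked = 0
for prefix, secret in probes:
    out = generator(prefix, max_new_tokens=20, do_sample=False)[0]["generated_text"]
    completion = out[len(prefix):]
    if secret in completion:
        leaked += 1
        print(f"LEAK: prefix {prefix!r} yielded {secret!r}")

print(f"Extraction rate: {leaked}/{len(probes)}")
```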
Abstract: The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. This inadvertent leakage of sensitive information typically occurs when the models are subjected to black-box attacks. To address the growing concern of safeguarding private and sensitive information while simultaneously preserving its utility, we analyze the performance of Targeted Catastrophic Forgetting (TCF). TCF protects targeted pieces of sensitive information within datasets through an iterative pipeline that significantly reduces the likelihood of such information being leaked or reproduced by the model under black-box attacks, such as the auto-completion attack in our case. The experiments conducted using TCF clearly demonstrate its capability to reduce the extraction of PII while preserving the context and utility of the target application.
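The abstract does not spell out the TCF pipeline itself, but its evaluation hinges on comparing how much targeted PII an auto-completion attack extracts before and after the defense. The sketch below only shows that measurement; `query_model` and `targets` are assumed names for a black-box completion interface and a hypothetical list of (prefix, pii) pairs.

```python
# Sketch of measuring PII extraction rate before/after a forgetting step.
# `query_model` is an assumed black-box completion interface; `targets` is a
# hypothetical list of (prefix, pii) pairs the defender wants the model to forget.

def extraction_rate(query_model, targets):
    """Fraction of targeted PII strings reproduced by auto-completion."""
    hits = sum(1 for prefix, pii in targets if pii in query_model(prefix))
    return hits / len(targets)

# Usage (hypothetical):
# rate_before = extraction_rate(base_model_completion, targets)
# ... apply the forgetting procedure ...
# rate_after = extraction_rate(defended_model_completion, targets)
# print(f"extraction rate: {rate_before:.2%} -> {rate_after:.2%}")
```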
Abstract: From fraud detection to speech recognition, including price prediction, Machine Learning (ML) applications are manifold and can significantly improve different areas. Nevertheless, machine learning models are vulnerable and are exposed to different security and privacy attacks. Hence, these issues should be addressed while using ML models to preserve the security and privacy of the data used. There is a need to secure ML models, especially in the training phase, to preserve the privacy of the training datasets and to minimise information leakage. In this paper, we present an overview of ML threats and vulnerabilities, and we highlight current progress in research works proposing defence techniques against ML security and privacy attacks. The relevant background for the different attacks occurring in both the training and testing/inference phases is introduced before presenting a detailed overview of Membership Inference Attacks (MIA) and the related countermeasures. We then introduce a countermeasure against MIA on Convolutional Neural Networks (CNN) based on dropout and L2 regularization. Through experimental analysis, we demonstrate that this defence technique can mitigate the risk of MIA while ensuring an acceptable accuracy of the model. Indeed, training a CNN model on two datasets, CIFAR-10 and CIFAR-100, we empirically verify the ability of our defence strategy to decrease the impact of MIA on the model, and we compare the results of five different classifiers. Moreover, we present a solution to achieve a trade-off between the performance of the model and the mitigation of MIA.
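As a concrete illustration of the two defence ingredients named here (not the authors' exact architecture or hyperparameters), a small PyTorch CNN for CIFAR-10 can combine dropout layers with L2 regularization supplied through the optimizer's weight_decay term:

```python
# Sketch: CNN with dropout + L2 regularization (via weight_decay) for CIFAR-10.
# Architecture and hyperparameters are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class RegularizedCNN(nn.Module):
    def __init__(self, num_classes=10, p_drop=0.5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p_drop),              # dropout limits memorization of individual samples
            nn.Linear(64 * 8 * 8, 256), nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = RegularizedCNN()
# weight_decay adds an L2 penalty on the weights during optimization.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
criterion = nn.CrossEntropyLoss()
```

Both knobs shrink the train/test confidence gap that membership inference attacks exploit, at some cost in clean accuracy, which is the trade-off the paper studies.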
Abstract: Deep learning can train models from a dataset to solve tasks. Although deep learning has attracted much interest owing to its excellent performance, security issues are gradually being exposed. Deep learning may be prone to the membership inference attack, where the attacker can determine the membership of a given sample. In this paper, we propose a new defense mechanism against membership inference: NoiseDA. In our proposal, a model is not trained directly on a sensitive dataset; instead, the threat of membership inference is alleviated by leveraging domain adaptation. Besides, a module called Feature Crafter has been designed to reduce the number of necessary training datasets from two to one: it creates features for domain-adaptation training using noise-additive mechanisms. Our experiments show that, with noise properly added by Feature Crafter, our proposal can reduce the success of membership inference with a controllable utility loss.
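The abstract does not specify the exact noise mechanism used by Feature Crafter; the sketch below only illustrates the general idea of deriving a second, noisy feature domain from a single dataset by additive noise. The Gaussian/Laplace choices and scales are assumptions.

```python
# Sketch of crafting a noisy feature copy from one dataset, so domain-adaptation
# training does not need a second dataset. Noise types and scales are illustrative.
import numpy as np

def craft_noisy_features(features: np.ndarray, scale: float = 0.1,
                         mechanism: str = "gaussian") -> np.ndarray:
    """Return a noise-perturbed copy of `features` to act as the second domain."""
    if mechanism == "gaussian":
        noise = np.random.normal(0.0, scale, size=features.shape)
    elif mechanism == "laplace":
        noise = np.random.laplace(0.0, scale, size=features.shape)
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return features + noise

# Usage (hypothetical):
# sensitive_feats = encoder(sensitive_inputs)             # "target" domain
# crafted_feats = craft_noisy_features(sensitive_feats)   # noisy "source" domain
```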
Abstract: Nowadays, machine learning (ML) algorithms cannot succeed without the availability of an enormous amount of training data. The data could contain sensitive information, which needs to be protected. Membership inference attacks attempt to find out whether a target data point was used to train a certain ML model, which has security and privacy implications. The leakage of membership information can vary from one machine-learning algorithm to another. In this paper, we conduct an empirical study to explore the performance of membership inference attacks against four different machine learning algorithms, namely K-nearest neighbors, random forest, support vector machine, and logistic regression, using three datasets. Our experiments reveal which machine learning model is most immune to privacy attacks. Additionally, we examine the effects of such attacks when varying the dataset size. Based on our observations of the experimental results, we propose a defense mechanism that is less prone to privacy attacks and demonstrate its effectiveness through an empirical evaluation.
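A common baseline for membership inference against classifiers of this kind (not necessarily the paper's exact attack) is a confidence-threshold test: members tend to receive a higher predicted probability on their true class than non-members. A minimal scikit-learn sketch, with the target model, data split, and threshold as illustrative assumptions:

```python
# Sketch: confidence-threshold membership inference against a sklearn classifier.
# Dataset, target model, and threshold tau are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_member, X_nonmember, y_member, y_nonmember = train_test_split(
    X, y, test_size=0.5, random_state=0)

# The "target" model is trained only on the member half.
target = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_member, y_member)

def true_class_confidence(model, X, y):
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

tau = 0.9  # threshold calibrated by the attacker (assumption)
guess_in = true_class_confidence(target, X_member, y_member) >= tau
guess_out = true_class_confidence(target, X_nonmember, y_nonmember) >= tau

# Attack accuracy: members should score above tau, non-members below.
acc = 0.5 * (guess_in.mean() + (1 - guess_out.mean()))
print(f"membership inference accuracy: {acc:.3f}")
```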
Abstract: Membership inference attacks on machine learning models have drawn significant attention. While current research primarily utilizes shadow-modeling techniques, which require knowledge of the target model and training data, practical scenarios involve black-box access to the target model with no available information. Limited training data further complicates the implementation of these attacks. In this paper, we experimentally compare common data enhancement schemes and propose a data synthesis framework based on the variational autoencoder generative adversarial network (VAE-GAN) to extend the training data for shadow models. In addition, we propose a shadow-model training algorithm based on adversarial training to improve the shadow model's ability to mimic the predicted behavior of the target model when the target model's information is unknown. By conducting attack experiments on different models under the black-box access setting, we verify the effectiveness of the VAE-GAN-based data synthesis framework for improving the accuracy of membership inference attacks. Furthermore, we verify that the shadow model trained using the adversarial training approach effectively improves the degree to which it mimics the predicted behavior of the target model. Compared with existing research methods, the method proposed in this paper achieves a 2% improvement in attack accuracy and delivers better attack performance.
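The VAE-GAN synthesis and adversarial shadow training cannot be reconstructed from the abstract alone; the sketch below shows only the standard shadow-model step they feed into, namely building the attack model's training set from shadow-model prediction vectors labeled in/out. All names and the choice of attack classifier are illustrative.

```python
# Sketch: building the attack dataset from a trained shadow model.
# `shadow_model` is assumed to expose predict_proba; X_in/X_out are the shadow
# model's own training and held-out data (possibly synthesized, e.g. by a VAE-GAN).
import numpy as np

def build_attack_dataset(shadow_model, X_in, X_out):
    """Label prediction vectors 1 for shadow-training members, 0 for non-members."""
    feats_in = shadow_model.predict_proba(X_in)
    feats_out = shadow_model.predict_proba(X_out)
    X_attack = np.vstack([feats_in, feats_out])
    y_attack = np.concatenate([np.ones(len(feats_in)), np.zeros(len(feats_out))])
    return X_attack, y_attack

# A simple attack model can then be trained on (X_attack, y_attack), e.g.:
# from sklearn.linear_model import LogisticRegression
# attack_model = LogisticRegression(max_iter=1000).fit(X_attack, y_attack)
```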
Funding: Supported by the National Natural Science Foundation of China (61941105, 61772406, and U2336203), the National Key Research and Development Program of China (2023QY1202), and the Beijing Natural Science Foundation (4242031).
Abstract: Membership inference (MI) attacks mainly aim to infer whether a data record was used to train a target model or not. Due to the serious privacy risks, MI attacks have been attracting a tremendous amount of attention in the research community. One existing work conducted, to the best of our knowledge, the first dedicated survey study in this specific area: it provides a comprehensive review of the literature during the period of 2017-2021 (over 100 papers). However, due to the tremendous amount of progress (176 papers) made in this area since 2021, that survey has unfortunately already become very limited in the following two aspects. (1) Although the entire literature to date covers 18 ways to categorize all the proposed MI attacks, the literature from 2017-2021, which was reviewed in that work, only covered 5 of those ways. With 13 ways missing, the existing survey covers only 27% of the landscape (in terms of how to categorize MI attacks) if a retrospective view is taken. (2) Because the 2017-2021 literature covers only 27% of that landscape, the number of new insights (i.e., why an MI attack could succeed) behind the proposed MI attacks has been increasing significantly since 2021; yet none of the previous works has made these insights a main focus, and a survey that overlooks them could fail to help researchers gain adequate intellectual depth in this area of research. In this work, we conduct a systematic study to address these limitations. To address the first limitation, we make the 13 newly emerged ways to categorize MI attacks a main focus of the study. To address the second limitation, we provide, to the best of our knowledge, the first review of the various insights leveraged in the entire literature, and we find that they can be broken down into 10 groups. Moreover, our survey also provides a comprehensive review of the existing defenses against MI attacks, the existing applications of MI attacks, the widely used datasets (e.g., 107 new datasets), and the evaluation metrics (e.g., 20 new evaluation metrics).
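As a reference point for the many attack formulations such a survey categorizes, the canonical loss-threshold membership inference decision rule (a baseline formulation, not attributed to any single surveyed paper) can be written as:

```latex
% Loss-threshold membership inference (baseline formulation).
% f_\theta: target model, \ell: its training loss, \tau: attacker-chosen threshold.
\[
  \mathcal{A}(x, y) \;=\; \mathbb{1}\!\left[\, \ell\big(f_\theta(x),\, y\big) \le \tau \,\right]
\]
% A(x, y) = 1 predicts "member": records seen during training tend to incur lower loss.
```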