As machine learning moves into high-risk and sensitive applications such as medical care,autonomous driving,and financial planning,how to interpret the predictions of the black-box model becomes the key to whether peo...As machine learning moves into high-risk and sensitive applications such as medical care,autonomous driving,and financial planning,how to interpret the predictions of the black-box model becomes the key to whether people can trust machine learning decisions.Interpretability relies on providing users with additional information or explanations to improve model transparency and help users understand model decisions.However,these information inevitably leads to the dataset or model into the risk of privacy leaks.We propose a strategy to reduce model privacy leakage for instance interpretability techniques.The following is the specific operation process.Firstly,the user inputs data into the model,and the model calculates the prediction confidence of the data provided by the user and gives the prediction results.Meanwhile,the model obtains the prediction confidence of the interpretation data set.Finally,the data with the smallest Euclidean distance between the confidence of the interpretation set and the prediction data as the explainable data.Experimental results show that The Euclidean distance between the confidence of interpretation data and the confidence of prediction data provided by this method is very small,which shows that the model's prediction of interpreted data is very similar to the model's prediction of user data.Finally,we demonstrate the accuracy of the explanatory data.We measure the matching degree between the real label and the predicted label of the interpreted data and the applicability to the network model.The results show that the interpretation method has high accuracy and wide applicability.展开更多
It is necessary to confirm the personal data factors and the rules of verification before conducting personal data detection. So that the detection method can be written in the subsequent implementation of the automat...It is necessary to confirm the personal data factors and the rules of verification before conducting personal data detection. So that the detection method can be written in the subsequent implementation of the automatic detection tool. This paper will conduct experiments on common personal data factor rules, including domestic personal identity numbers and credit card numbers with checksums. We use ChatGPT to test the accuracy of identifying personal information like ID card identification numbers or credit card numbers. And then use personal data correlation to reduce the time for personal data identification. Although the number of personal information factors found has decreased, it has had a better effect on the actual manual personal data identification. The result shows that it saves about 45% of the calculation time, and the execution efficiency of the accuracy is also improved with the original method by about 22%, which is about 2.2 times higher than the general method. Therefore, the method proposed in this paper can accurately and effectively find out the leftover personal information in the enterprise. .展开更多
The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Infor...The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. This inadvertent leakage of sensitive information typically occurs when the models are subjected to black-box attacks. To address the growing concerns of safeguarding private and sensitive information while simultaneously preserving its utility, we analyze the performance of Targeted Catastrophic Forgetting (TCF). TCF involves preserving targeted pieces of sensitive information within datasets through an iterative pipeline which significantly reduces the likelihood of such information being leaked or reproduced by the model during black-box attacks, such as the autocompletion attack in our case. The experiments conducted using TCF evidently demonstrate its capability to reduce the extraction of PII while still preserving the context and utility of the target application.展开更多
This paper examines how cybersecurity is developing and how it relates to more conventional information security. Although information security and cyber security are sometimes used synonymously, this study contends t...This paper examines how cybersecurity is developing and how it relates to more conventional information security. Although information security and cyber security are sometimes used synonymously, this study contends that they are not the same. The concept of cyber security is explored, which goes beyond protecting information resources to include a wider variety of assets, including people [1]. Protecting information assets is the main goal of traditional information security, with consideration to the human element and how people fit into the security process. On the other hand, cyber security adds a new level of complexity, as people might unintentionally contribute to or become targets of cyberattacks. This aspect presents moral questions since it is becoming more widely accepted that society has a duty to protect weaker members of society, including children [1]. The study emphasizes how important cyber security is on a larger scale, with many countries creating plans and laws to counteract cyberattacks. Nevertheless, a lot of these sources frequently neglect to define the differences or the relationship between information security and cyber security [1]. The paper focus on differentiating between cybersecurity and information security on a larger scale. The study also highlights other areas of cybersecurity which includes defending people, social norms, and vital infrastructure from threats that arise from online in addition to information and technology protection. It contends that ethical issues and the human factor are becoming more and more important in protecting assets in the digital age, and that cyber security is a paradigm shift in this regard [1].展开更多
基金This work is supported by the National Natural Science Foundation of China(Grant No.61966011)Hainan University Education and Teaching Reform Research Project(Grant No.HDJWJG01)+3 种基金Key Research and Development Program of Hainan Province(Grant No.ZDYF2020033)Young Talents’Science and Technology Innovation Project of Hainan Association for Science and Technology(Grant No.QCXM202007)Hainan Provincial Natural Science Foundation of China(Grant No.621RC612)Hainan Provincial Natural Science Foundation of China(Grant No.2019RC107).
文摘As machine learning moves into high-risk and sensitive applications such as medical care,autonomous driving,and financial planning,how to interpret the predictions of the black-box model becomes the key to whether people can trust machine learning decisions.Interpretability relies on providing users with additional information or explanations to improve model transparency and help users understand model decisions.However,these information inevitably leads to the dataset or model into the risk of privacy leaks.We propose a strategy to reduce model privacy leakage for instance interpretability techniques.The following is the specific operation process.Firstly,the user inputs data into the model,and the model calculates the prediction confidence of the data provided by the user and gives the prediction results.Meanwhile,the model obtains the prediction confidence of the interpretation data set.Finally,the data with the smallest Euclidean distance between the confidence of the interpretation set and the prediction data as the explainable data.Experimental results show that The Euclidean distance between the confidence of interpretation data and the confidence of prediction data provided by this method is very small,which shows that the model's prediction of interpreted data is very similar to the model's prediction of user data.Finally,we demonstrate the accuracy of the explanatory data.We measure the matching degree between the real label and the predicted label of the interpreted data and the applicability to the network model.The results show that the interpretation method has high accuracy and wide applicability.
文摘It is necessary to confirm the personal data factors and the rules of verification before conducting personal data detection. So that the detection method can be written in the subsequent implementation of the automatic detection tool. This paper will conduct experiments on common personal data factor rules, including domestic personal identity numbers and credit card numbers with checksums. We use ChatGPT to test the accuracy of identifying personal information like ID card identification numbers or credit card numbers. And then use personal data correlation to reduce the time for personal data identification. Although the number of personal information factors found has decreased, it has had a better effect on the actual manual personal data identification. The result shows that it saves about 45% of the calculation time, and the execution efficiency of the accuracy is also improved with the original method by about 22%, which is about 2.2 times higher than the general method. Therefore, the method proposed in this paper can accurately and effectively find out the leftover personal information in the enterprise. .
文摘The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. This inadvertent leakage of sensitive information typically occurs when the models are subjected to black-box attacks. To address the growing concerns of safeguarding private and sensitive information while simultaneously preserving its utility, we analyze the performance of Targeted Catastrophic Forgetting (TCF). TCF involves preserving targeted pieces of sensitive information within datasets through an iterative pipeline which significantly reduces the likelihood of such information being leaked or reproduced by the model during black-box attacks, such as the autocompletion attack in our case. The experiments conducted using TCF evidently demonstrate its capability to reduce the extraction of PII while still preserving the context and utility of the target application.
文摘This paper examines how cybersecurity is developing and how it relates to more conventional information security. Although information security and cyber security are sometimes used synonymously, this study contends that they are not the same. The concept of cyber security is explored, which goes beyond protecting information resources to include a wider variety of assets, including people [1]. Protecting information assets is the main goal of traditional information security, with consideration to the human element and how people fit into the security process. On the other hand, cyber security adds a new level of complexity, as people might unintentionally contribute to or become targets of cyberattacks. This aspect presents moral questions since it is becoming more widely accepted that society has a duty to protect weaker members of society, including children [1]. The study emphasizes how important cyber security is on a larger scale, with many countries creating plans and laws to counteract cyberattacks. Nevertheless, a lot of these sources frequently neglect to define the differences or the relationship between information security and cyber security [1]. The paper focus on differentiating between cybersecurity and information security on a larger scale. The study also highlights other areas of cybersecurity which includes defending people, social norms, and vital infrastructure from threats that arise from online in addition to information and technology protection. It contends that ethical issues and the human factor are becoming more and more important in protecting assets in the digital age, and that cyber security is a paradigm shift in this regard [1].