Abstract: The recent interest in deploying Generative AI applications that use large language models (LLMs) has brought significant privacy concerns to the forefront, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. We describe different black-box attacks from potential adversaries and study their impact on the amount and type of information that may be recovered from commonly used and deployed LLMs. Our research investigates the relationship between PII leakage, memorization, and factors such as model size, architecture, and the nature of the attacks employed. The study uses two broad categories of attacks: PII leakage-focused attacks (auto-completion and extraction attacks) and memorization-focused attacks (various membership inference attacks). The findings are quantified using an array of evaluation metrics, providing a detailed understanding of LLM vulnerabilities and the effectiveness of different attacks.
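The abstract does not say which membership inference attacks were used. As a minimal sketch of the general idea, the example below shows a generic loss-threshold baseline: a sample is guessed to be a training member when the model's loss on it falls below a threshold. The function names, the threshold, and the per-sample loss values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def loss_threshold_mia(losses: Sequence[float], threshold: float) -> List[bool]:
    """Guess 'member' for every sample whose loss is below the threshold."""
    return [loss < threshold for loss in losses]


def attack_accuracy(predictions: Sequence[bool], labels: Sequence[bool]) -> float:
    """Fraction of samples whose membership was guessed correctly."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical per-sample losses (illustrative values, not measured results).
member_losses = [0.4, 0.7, 1.1]       # samples assumed to be in the fine-tuning set
non_member_losses = [1.9, 2.3, 0.9]   # samples assumed to be unseen

losses = member_losses + non_member_losses
labels = [True] * len(member_losses) + [False] * len(non_member_losses)

predictions = loss_threshold_mia(losses, threshold=1.0)
accuracy = attack_accuracy(predictions, labels)
print(predictions, accuracy)
```

In a black-box setting the adversary would have to estimate the loss (or a proxy such as perplexity) from model outputs alone; the sketch assumes those values are already available.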
Funding: the Scientific Research and Development Project of Shandong Provincial Education Department (J06P01) and the Science and Technology Foundation of University of Jinan (XKY0703).
Abstract: Two pairs of approximation operators are defined: the scale lower and upper approximations, and the real line lower and upper approximations. Their properties and antithesis characteristics are analyzed. The rough function model is generalized on the basis of rough set theory, and the scheme of rough function theory is made more distinct and complete. The transformation of real function analysis from the real line to the scale is thereby achieved. A series of basic concepts in the rough function model, including rough numbers, rough intervals, and rough membership functions, are defined in the new scheme. Operating properties of rough intervals, similar to those of rough sets, are obtained. The relations of rough inclusion and rough equality of rough intervals are defined by two kinds of tools: the lower (upper) approximation operator in the domain of real numbers, and rough membership functions. Their relative properties are analyzed and rigorously proved, which provides the necessary theoretical foundation and technical support for further discussion of the properties and practical applications of the rough function model.
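The abstract does not give the operators' definitions. As a rough illustration only, the sketch below assumes the classical reading in which a scale is a finite, sorted set of points on the real line, the lower approximation of a number is the greatest scale point not exceeding it, and the upper approximation is the least scale point not below it. The function names and the example scale are hypothetical, and the paper's own definitions may differ.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence


def lower_approx(scale: Sequence[float], x: float) -> float:
    """Greatest scale point that is <= x (scale must be sorted ascending)."""
    i = bisect_right(scale, x)
    if i == 0:
        raise ValueError("x lies below the scale")
    return scale[i - 1]


def upper_approx(scale: Sequence[float], x: float) -> float:
    """Least scale point that is >= x (scale must be sorted ascending)."""
    i = bisect_left(scale, x)
    if i == len(scale):
        raise ValueError("x lies above the scale")
    return scale[i]


def is_exact(scale: Sequence[float], x: float) -> bool:
    """x is exact on the scale when both approximations coincide, else rough."""
    return lower_approx(scale, x) == upper_approx(scale, x)


# Hypothetical scale chosen for illustration.
scale = [0.0, 1.0, 2.0, 3.0]

print(lower_approx(scale, 1.4), upper_approx(scale, 1.4), is_exact(scale, 1.4))
print(lower_approx(scale, 2.0), upper_approx(scale, 2.0), is_exact(scale, 2.0))
```

Under this assumption, 1.4 is a rough number on the scale because it can only be bracketed by the scale points 1.0 and 2.0, while 2.0 is exact because it is itself a scale point.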
Funding: supported by the National Natural Science Foundation of China (70831001, 70821061).
Abstract: A new fuzzy support vector machine algorithm with dual membership values, based on the spectral clustering method, is proposed to overcome a shortcoming of the standard support vector machine algorithm, which divides the training data into two mutually exclusive classes in binary classification and ignores the possibility of an "overlapping" region between the two training classes. The proposed method handles sample "overlap" efficiently with spectral clustering, overcoming the disadvantages of over-fitting and greatly improving data mining efficiency. Simulation results provide clear evidence for the new method.