Funding: This research work was funded by the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, through Project Number 959.
Abstract: A cataract is one of the most significant eye diseases worldwide; it does not impair vision immediately but progressively worsens over time. Automatic cataract prediction based on various imaging technologies has been addressed recently, for example in smartphone apps used for remote health monitoring and eye care. In recent years, advances in diagnosis, prediction, and clinical decision support using Artificial Intelligence (AI) in medicine and ophthalmology have grown exponentially. However, privacy concerns limit the data available for training, which makes applying AI models in the medical field challenging. To address this issue, this research proposes CDFL, a federated learning framework based on a VGG16 deep neural network. The study uses the Ocular Disease Intelligent Recognition (ODIR) database, which contains 5,000 patient records. Significant features are extracted and normalized using the min-max normalization technique. In the federated learning-based technique, each of the two clients trains the VGG16 model on its local data; the proposed method trains the local model before transferring its parameters to the global model, and the global model improves after integrating the new parameters. Each client runs three training rounds to reduce over-fitting. The experimental results show the effectiveness of the federated learning-based technique on a Deep Neural Network (DNN), reaching 95.28% accuracy while also preserving the privacy of patient data. The experiments demonstrate that the proposed federated learning model outperforms other traditional methods, achieving 95.0% accuracy for client 1 and 96.0% for client 2.
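To make the training loop concrete, here is a minimal sketch of the two-client setup the abstract describes: min-max normalization of features followed by three rounds of aggregation. The abstract does not name its aggregation rule, so FedAvg-style averaging is assumed, and the toy local-training step stands in for real VGG16 training.

```python
import numpy as np

def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each feature column to [0, 1], as in the abstract's min-max step."""
    return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0) + 1e-8)

def federated_average(client_weights):
    """FedAvg-style aggregation: average each layer across clients."""
    return [np.mean(layers, axis=0) for layers in zip(*client_weights)]

def local_train(global_w, local_data):
    """Toy stand-in for a client's local VGG16 training pass."""
    return [w + 0.1 * local_data.mean() for w in global_w]

rng = np.random.default_rng(0)
client_data = [min_max_normalize(rng.normal(size=(100, 8))) for _ in range(2)]
global_w = [np.zeros((8, 4)), np.zeros(4)]     # stand-in for the model's layers

for _ in range(3):                             # three rounds, per the abstract
    updates = [local_train(global_w, d) for d in client_data]
    global_w = federated_average(updates)
```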
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 61872069, and in part by the Fundamental Research Funds for the Central Universities under Grant N2017012.
Abstract: Outsourcing the k-Nearest Neighbor (kNN) classifier to the cloud is useful, yet it can lead to serious privacy leakage of the sensitive outsourced data and models. In this paper, we design, implement, and evaluate a new system employing an outsourced privacy-preserving kNN Classifier Model based on Multi-Key Homomorphic Encryption (kNNCM-MKHE). We first propose a security protocol based on multi-key Brakerski-Gentry-Vaikuntanathan (BGV) encryption for collaborative evaluation of a kNN classifier provided by multiple model owners. We analyze the operations of kNN and extract the basic operations, such as addition, multiplication, and comparison; the protocol supports computation over data encrypted under different public keys. We further design a scheme that outsources the evaluation work to a third-party evaluator that should have no access to the models or data. In the evaluation process, each model owner encrypts its model and uploads it to the evaluator. After receiving the encrypted kNN classifiers and the user's inputs, the evaluator runs a secure computing protocol to aggregate the count of each class label, then sends the class labels with their associated counts to the user. The result is encrypted jointly by the model owners and the user, so no information is disclosed to the evaluator. The experimental results show that our system securely allows multiple model owners to delegate the evaluation of a kNN classifier.
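As a point of reference, the following sketch shows the plaintext analogue of the evaluator's task: finding the k nearest neighbours and aggregating a count per class label. In kNNCM-MKHE the same steps run over multi-key BGV ciphertexts; the encryption layer is omitted here and the data are illustrative.

```python
from collections import Counter
import numpy as np

def knn_label_counts(train_x, train_y, query, k=5):
    """Plaintext analogue of the evaluator's job: find the k nearest
    neighbours of the query and count the class labels among them.
    In kNNCM-MKHE the distances and counts are computed over BGV
    ciphertexts instead of plaintext arrays."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest)   # {label: count}, sent to the user

rng = np.random.default_rng(1)
x, y = rng.normal(size=(50, 3)), rng.integers(0, 2, size=50)
print(knn_label_counts(x, y, query=np.zeros(3)))
```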
Funding: Supported by the National Key Research and Development Program of China (No. 2018AAA0101100).
Abstract: In the past decades, artificial intelligence (AI) has achieved unprecedented success, with statistical models becoming the central entity in AI. However, the centralized training and inference paradigm for building and using these models faces more and more privacy and legal challenges. To bridge the gap between data privacy and the need for data fusion, an emerging AI paradigm, federated learning (FL), has arisen as an approach for solving the data-silo and data-privacy problems. Built on secure distributed AI, federated learning emphasizes data security throughout the lifecycle, which includes the following steps: data preprocessing, training, evaluation, and deployment. FL maintains data security by using methods such as secure multi-party computation (MPC), differential privacy, and hardware solutions to build and use distributed multi-party machine-learning systems and statistical models over different data sources. Beyond data privacy, we argue that the concept of "model" also matters: when federated models are developed and deployed, they are easily exposed to various risks, including plagiarism, illegal copying, and misuse. To address these issues, we introduce FedIPR, a novel ownership verification scheme that embeds watermarks into FL models to verify their ownership and protect model intellectual property rights (IPR, or IP-right for short). While security is at the core of FL, many articles still refer to distributed machine learning with no security guarantee as "federated learning", which does not satisfy the intended definition of FL. To this end, in this paper we reiterate the concept of federated learning and propose secure federated learning (SFL), whose ultimate goal is to build trustworthy and safe AI with strong privacy preservation and IP-right preservation. We provide a comprehensive overview of existing work, including threats, attacks, and defenses in each phase of SFL from the lifecycle perspective.
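As an illustration of the watermarking idea behind FedIPR, the following is a simplified sign-based sketch: a secret set of parameter positions encodes the watermark bits in their signs, and ownership is verified by measuring sign agreement. FedIPR's actual scheme embeds the watermark during FL training (for example via a sign loss), so the standalone post-hoc nudge below is an assumption made for brevity.

```python
import numpy as np

def embed_watermark(weights, key_idx, bits, strength=0.02):
    """Nudge selected parameters so their signs encode the watermark bits
    (a simplified sign-based scheme in the spirit of FedIPR)."""
    w = weights.copy()
    for i, b in zip(key_idx, bits):
        w[i] += strength if b else -strength   # push toward the target sign
    return w

def verify_watermark(weights, key_idx, bits):
    """Ownership check: fraction of selected signs matching the bits."""
    signs = weights[key_idx] > 0
    return float(np.mean(signs == bits.astype(bool)))

rng = np.random.default_rng(2)
w = rng.normal(scale=0.005, size=1000)         # stand-in for one model layer
idx = rng.choice(w.size, size=64, replace=False)
bits = rng.integers(0, 2, size=64)
w_marked = embed_watermark(w, idx, bits)
print(verify_watermark(w_marked, idx, bits))   # close to 1.0 for the marked model
print(verify_watermark(w, idx, bits))          # near 0.5 for an unmarked model
```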
Funding: Supported by the National Natural Science Foundation of China (Nos. 92270116 and 62071155).
Abstract: Federated learning (FL) is a promising decentralized machine learning approach that enables multiple distributed clients to train a model jointly while keeping their data private. However, in real-world scenarios, the supervised training data stored on local clients inevitably suffer from imperfect annotations, resulting in subjective, inconsistent, and biased labels. These noisy labels can harm the collaborative aggregation process of FL by inducing inconsistent decision boundaries. Unfortunately, few attempts have been made towards noise-tolerant federated learning, and most of them rely on transmitting extra messages to assist noisy-label detection and correction, which increases both the communication burden and the privacy risks. In this paper, we propose a simple yet effective method for noise-tolerant FL based on the well-established co-training framework. Our method leverages the inherent discrepancy in the learning ability of the local and global models in FL, which can be regarded as two complementary views. By iteratively exchanging samples along with their high-confidence predictions, the two models "teach each other" to suppress the influence of noisy labels. The proposed scheme incurs no extra overhead and can serve as a robust and efficient baseline for noise-tolerant federated learning. Experimental results demonstrate that our method outperforms existing approaches, highlighting its superiority.
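The core exchange step can be sketched in a few lines: each of the two views (the local and the global model) selects its high-confidence predictions on a shared batch, and those pseudo-labels supervise the other model in the next step. The confidence threshold and the random stand-in outputs below are illustrative; the paper's training details are not reproduced.

```python
import numpy as np

def confident_pseudo_labels(probs, threshold=0.9):
    """Return indices and labels where a model's confidence is high."""
    conf = probs.max(axis=1)
    idx = np.where(conf >= threshold)[0]
    return idx, probs[idx].argmax(axis=1)

# One co-training exchange between the local and global models' predictions
# on the same (possibly noisily labelled) local batch; the models then
# "teach each other" by training on the other's confident pseudo-labels.
rng = np.random.default_rng(3)
local_probs  = rng.dirichlet(alpha=[1, 1, 1], size=32)   # stand-in model outputs
global_probs = rng.dirichlet(alpha=[1, 1, 1], size=32)

idx_for_global, labels_from_local  = confident_pseudo_labels(local_probs)
idx_for_local,  labels_from_global = confident_pseudo_labels(global_probs)
# train_step(global_model, batch[idx_for_global], labels_from_local)   # sketch only
# train_step(local_model,  batch[idx_for_local],  labels_from_global)  # sketch only
```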
Funding: Supported in part by the Major Science and Technology Demonstration Project of the Jiangsu Provincial Key R&D Program under Grant No. BE2023025, in part by the National Natural Science Foundation of China under Grant No. 62302238, in part by the Natural Science Foundation of Jiangsu Province under Grant No. BK20220388, in part by the Natural Science Research Project of Colleges and Universities in Jiangsu Province under Grant No. 22KJB520004, and in part by the China Postdoctoral Science Foundation under Grant No. 2022M711689.
Abstract: This paper presents a comprehensive exploration of the integration of the Internet of Things (IoT), big data analysis, cloud computing, and Artificial Intelligence (AI), which has led to an unprecedented era of connectivity. We delve into the emerging trend of machine learning on embedded devices, which enables such tasks in resource-limited environments. However, the widespread adoption of machine learning raises significant privacy concerns, necessitating the development of privacy-preserving techniques. One such technique, secure multi-party computation (MPC), allows collaborative computation without exposing private inputs. Despite its potential, complex protocols and heavy communication hinder its performance, especially on resource-constrained devices. Efforts have been made to improve efficiency, but scalability remains a challenge. Given the success of GPUs in deep learning, leveraging embedded GPUs, such as those offered by NVIDIA, emerges as a promising solution. We therefore propose an Embedded GPU-based Secure Two-party Computation (EG-STC) framework for AI systems. To the best of our knowledge, this work is the first to fully implement machine learning model training based on secure two-party computation on an embedded GPU platform. Our experimental results demonstrate the effectiveness of EG-STC. On an embedded GPU with a power draw of 5 W, our implementation achieved a secure two-party matrix multiplication throughput of 5881.5 kilo-operations per millisecond (kops/ms), with an energy efficiency ratio of 1176.3 kops/ms/W. Furthermore, our EG-STC framework achieved an overall speed-up of 5 to 6 times compared with solutions running on server-grade CPUs, and required only 60% to 70% of the runtime of the previously best-known methods on the same platform. In summary, our research contributes to secure and efficient machine learning implementations on resource-constrained embedded devices, paving the way for broader adoption of AI technologies in various applications.
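For context, the arithmetic core of secure two-party matrix multiplication, a Beaver-triple protocol over additive secret shares, can be simulated in plaintext as below. This is a CPU toy illustrating the protocol's algebra, not the paper's GPU implementation; the small modulus is an assumption so the arithmetic fits in int64, and the network exchange of the opened values is elided.

```python
import numpy as np

Q = 2**16   # toy modulus for additive secret sharing (keeps products in int64)

def share(m, rng):
    """Split a matrix into two additive shares mod Q."""
    r = rng.integers(0, Q, size=m.shape)
    return r, (m - r) % Q

def beaver_matmul(x_sh, y_sh, rng):
    """Two-party secure product Z = X @ Y from additive shares, using a
    Beaver triple (A, B, C = A @ B) from a trusted dealer. Only the masked
    values E = X - A and F = Y - B are ever opened between the parties."""
    a = rng.integers(0, Q, size=x_sh[0].shape)
    b = rng.integers(0, Q, size=y_sh[0].shape)
    a_sh, b_sh, c_sh = share(a, rng), share(b, rng), share(a @ b % Q, rng)
    e = (x_sh[0] + x_sh[1] - a) % Q            # opened: X - A
    f = (y_sh[0] + y_sh[1] - b) % Q            # opened: Y - B
    z0 = (c_sh[0] + e @ b_sh[0] + a_sh[0] @ f + e @ f) % Q
    z1 = (c_sh[1] + e @ b_sh[1] + a_sh[1] @ f) % Q
    return z0, z1                              # Z = z0 + z1 mod Q

rng = np.random.default_rng(4)
x, y = rng.integers(0, 100, (3, 4)), rng.integers(0, 100, (4, 2))
z0, z1 = beaver_matmul(share(x, rng), share(y, rng), rng)
assert np.array_equal((z0 + z1) % Q, x @ y)    # reconstruction matches X @ Y
```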
Funding: Supported by collaborative research funding from the National Research Council of Canada's Aging in Place Challenge Program.
Abstract: Outsourcing decision tree models to cloud servers allows model providers to distribute their models at scale without purchasing dedicated hardware for model hosting. However, model providers may be forced to disclose private model details when hosting their models in the cloud. Given the time and monetary investment associated with model training, these privacy concerns may make model providers reluctant to host their models in the cloud. Clients, in turn, may be reluctant to use such outsourced models because their private queries or results could be disclosed to the cloud servers. In this paper, we propose BloomDT, a privacy-preserving scheme for decision tree inference that uses Bloom filters to hide the original decision tree's structure, the threshold values of each node, and the order in which features are tested, while maintaining reliable classification results that remain secure even if the cloud servers collude. Our scheme's security and performance are verified through rigorous testing and analysis.
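Since the construction rests on Bloom filters, a minimal filter is sketched below, together with one illustrative way a tree node could be encoded as an opaque string for membership testing. The node encoding shown is hypothetical, not the paper's actual representation.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter of the kind BloomDT builds on: set-membership
    queries with no false negatives and a tunable false-positive rate."""
    def __init__(self, m=1024, k=4):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, item: str):
        # Derive k bit positions by hashing the item with k salts.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: str):
        return all(self.bits[p] for p in self._positions(item))

# Hypothetical encoding: a node's test and outcome become an opaque string,
# so a server can check membership without seeing the tree in the clear.
bf = BloomFilter()
bf.add("feature_3<=0.75:left")
print("feature_3<=0.75:left" in bf)   # True
print("feature_1<=0.10:left" in bf)   # False (barring a rare false positive)
```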
Abstract: Deep learning can train models from a dataset to solve tasks. Although deep learning has attracted much interest owing to its excellent performance, security issues are gradually being exposed. In particular, deep learning models may be prone to membership inference attacks, in which an attacker can determine whether a given sample was part of the training set. In this paper, we propose a new defense mechanism against membership inference: NoiseDA. In our proposal, a model is not trained directly on the sensitive dataset; instead, the threat of membership inference is alleviated by leveraging domain adaptation. In addition, we design a module called Feature Crafter that reduces the number of required training datasets from two to one by creating features for domain-adaptation training using noise-additive mechanisms. Our experiments show that, with noise properly added by the Feature Crafter, our proposal can reduce the success of membership inference with a controllable utility loss.
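A highly simplified reading of the Feature Crafter idea is sketched below: synthesizing a second, noisy feature domain from the single sensitive dataset, so that domain-adaptation training needs one dataset rather than two. The Gaussian mechanism and the scale are assumptions made for illustration, not the paper's specified noise mechanism.

```python
import numpy as np

def feature_crafter(features: np.ndarray, scale: float, rng) -> np.ndarray:
    """Create a noisy companion domain from one sensitive dataset by adding
    Gaussian noise (illustrative stand-in for NoiseDA's Feature Crafter)."""
    return features + rng.normal(scale=scale, size=features.shape)

rng = np.random.default_rng(5)
sensitive = rng.normal(size=(128, 16))            # stand-in for private features
crafted = feature_crafter(sensitive, scale=0.5, rng=rng)
# Domain-adaptation training then runs between `crafted` and `sensitive`,
# rather than training the model directly on the sensitive data alone.
```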
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2018YFB0804105), and in part by the National Natural Science Foundation of China (Grant Nos. 62102037 and 61932019).
Abstract: Secure multi-party computation (MPC) allows a set of parties to jointly compute a function on their private inputs while revealing nothing but the output of the function. In the last decade, MPC has rapidly moved from a purely theoretical study to an object of practical interest, with growing attention to applications such as privacy-preserving machine learning (PPML). In this paper, we comprehensively survey existing work on concretely efficient MPC protocols with both semi-honest and malicious security, in both the dishonest-majority and honest-majority settings. We focus on the notion of security with abort, meaning that corrupted parties may prevent honest parties from receiving the output after they themselves have received it. We present the high-level ideas of the basic and key approaches for designing different styles of MPC protocols, and the crucial building blocks of MPC. For MPC applications, we compare the known PPML protocols built on MPC and describe the efficiency of private inference and training for the state-of-the-art PPML protocols. Furthermore, we summarize several challenges and open problems in improving the efficiency of MPC protocols, as well as some interesting future work worth addressing. This survey aims to present the recent developments and key approaches of MPC to researchers interested in understanding, improving, and applying concretely efficient MPC protocols.
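To ground the definition, the classic introductory example, n-party additive secret sharing used to compute a joint sum, is sketched below: any n-1 shares are uniformly random, and only the recombined output is revealed.

```python
import secrets

Q = 2**61 - 1   # prime modulus for the toy share arithmetic

def share(x: int, n: int):
    """Split x into n additive shares mod Q; any n-1 of them reveal nothing."""
    shares = [secrets.randbelow(Q) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % Q)
    return shares

# Three parties jointly compute the sum of their private inputs: each party
# shares its input with the others, every party adds the shares it holds,
# and only the final recombination reveals the output, nothing else.
inputs = [42, 17, 99]
held = [share(x, 3) for x in inputs]             # held[i][j]: party j's share of input i
partial = [sum(col) % Q for col in zip(*held)]   # each party sums its shares locally
print(sum(partial) % Q)                          # 158: the only value revealed
```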