In the context of edge computing environments in general and the metaverse in particular, federated learning (FL) has emerged as a distributed machine learning paradigm that allows multiple users to collaborate on training a shared machine learning model locally, eliminating the need to upload raw data to a central server. It is perhaps the only training paradigm that preserves the privacy of user data, which is essential for computing environments as personal as the metaverse. However, the originally proposed FL architecture does not scale to the large number of user devices in the metaverse community. To mitigate this problem, hierarchical federated learning (HFL) has been introduced as a general distributed learning paradigm, inspiring a number of research works. In this paper, we present several types of HFL architectures, with a special focus on the three-layer client-edge-cloud HFL architecture, which is most pertinent to the metaverse due to its delay-sensitive nature. We also examine works that take advantage of the natural layered organization of three-layer client-edge-cloud HFL to tackle some of the most challenging problems in FL within the metaverse. Finally, we outline some future research directions of HFL in the metaverse.
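As a minimal sketch of the three-layer client-edge-cloud organization described above, the following assumes sample-weighted averaging at both the edge and the cloud level. The abstract does not fix an aggregation rule, so the rule and all function names here are illustrative assumptions.

```python
from typing import List, Tuple

Vector = List[float]
Update = Tuple[Vector, int]  # (model parameters, number of local samples)


def weighted_average(updates: List[Update]) -> Update:
    """Average parameter vectors, weighting each by its sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(vec[i] * n for vec, n in updates) / total for i in range(dim)]
    return avg, total


def hierarchical_round(edges: List[List[Update]]) -> Vector:
    """One round: every edge server averages its own clients,
    then the cloud averages the resulting edge models."""
    edge_models = [weighted_average(clients) for clients in edges]
    cloud_model, _ = weighted_average(edge_models)
    return cloud_model


# Two edge servers: the first serves two clients, the second serves one.
edges = [
    [([1.0, 2.0], 10), ([3.0, 4.0], 30)],
    [([5.0, 6.0], 60)],
]
global_model = hierarchical_round(edges)
```

Because each edge forwards its total sample count, the two-level average equals the flat sample-weighted average over all clients, while only edge models travel to the cloud.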
Benefiting from the development of Federated Learning (FL) and distributed communication systems, large-scale intelligent applications have become possible. Distributed devices provide adequate training data, but they also cause privacy leakage and energy consumption. How to optimize energy consumption in distributed communication systems while ensuring user privacy and model accuracy has become an urgent challenge. In this paper, we define FL as a 3-layer architecture comprising users, agents, and a server. To find a balance among model training accuracy, privacy preservation, and energy consumption, we model the training process of FL as a game. We use an extensive-form game tree to analyze the key elements that influence the players' decisions in the single game, and then use the repeated game to find an incentive mechanism that meets social norms. The experimental results show that the Nash equilibrium we obtain satisfies the laws of reality, and that the proposed incentive mechanism encourages users to submit high-quality data in FL. Over multiple rounds of play, the incentive mechanism helps all players find the optimal strategies for energy, privacy, and accuracy of FL in distributed communication systems.
In the assessment of car insurance claims, the claim rate follows a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining a Tweedie regression model involves training on a centralized dataset. When the data is provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to compute the intersection of the entities held by the two parties. After determining which entities are shared, the participants train the model locally on the shared entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption is then used to exchange and update these intermediate parameters, so that the parties collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data, and outperform the Tweedie regression model learned independently by a single party.
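For orientation, the quantity a Tweedie regression model is usually fitted and assessed with is the Tweedie deviance. The sketch below computes the standard unit deviance for a power parameter 1 < p < 2 (the compound Poisson-gamma range commonly used for claim data). The default p = 1.5 and the function names are assumptions for illustration, and nothing here reproduces the federated or encrypted parts of the algorithm.

```python
from typing import Sequence


def tweedie_unit_deviance(y: float, mu: float, p: float = 1.5) -> float:
    """Unit Tweedie deviance for 1 < p < 2.

    y  : observed value (claims can be exactly zero), y >= 0
    mu : predicted mean, mu > 0
    """
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch only covers 1 < p < 2")
    if y < 0 or mu <= 0:
        raise ValueError("need y >= 0 and mu > 0")
    term_y = y ** (2 - p) / ((1 - p) * (2 - p))
    term_cross = y * mu ** (1 - p) / (1 - p)
    term_mu = mu ** (2 - p) / (2 - p)
    return 2.0 * (term_y - term_cross + term_mu)


def mean_tweedie_deviance(ys: Sequence[float], mus: Sequence[float],
                          p: float = 1.5) -> float:
    """Average unit deviance over a set of observations."""
    return sum(tweedie_unit_deviance(y, m, p) for y, m in zip(ys, mus)) / len(ys)
```

The deviance is zero when the prediction equals the observation and stays finite for zero claims, which is what makes this family suitable for skewed, zero-heavy claim data.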
Data sharing and privacy protection are made possible by federated learning, which allows for continuous model parameter sharing between several clients and a central server. In practical applications, multiple reliable and high-quality clients must participate for the federated learning global model to be accurate, but because the clients are independent, the central server cannot fully control their behavior. The central server has no way of knowing whether the model parameters provided by each client in a given round are correct, so clients may purposefully or unwittingly submit anomalous data, leading to abnormal behavior such as acting as malicious attackers or defective clients. To reduce the negative consequences, it is crucial to detect these abnormalities quickly and to provide appropriate incentives. In this paper, we propose a Federated Learning framework for Detecting and Incentivizing Abnormal Clients (FL-DIAC) to accomplish efficient and secure federated learning. For the abnormal client detection problem, we build a detector that uses an auto-encoder for anomaly detection, and use it to identify anomalies and prevent the involvement of abnormal clients. Before the model parameters are input to the detector, we apply a Fourier transform-based method for dimensionality reduction in order to reduce the computational complexity. Additionally, we create a credit score-based incentive structure to encourage clients to participate actively in training. Three training models (CNN, MLP, and ResNet-18) and three datasets (MNIST, Fashion MNIST, and CIFAR-10) have been used in experiments. According to theoretical analysis and experimental findings, FL-DIAC is superior to other federated learning schemes of the same type in terms of effectiveness.
The increasing data pool in the finance sector confronts machine learning (ML) with new complications. Banking data has significant financial implications and is confidential. Combining users' data from several organizations for various banking services may result in intrusions and privacy leakage. As a result, this study employs federated learning (FL) using the Flower framework to preserve each organization's privacy while collaborating to build a robust shared global model. However, diverse data distributions in the collaborative training process might result in inadequate model learning and a lack of privacy. To address this issue, the present paper proposes the implementation of the Federated Averaging (FedAvg) and Federated Proximal (FedProx) methods in the Flower framework, which take advantage of data locality during training and guarantee global convergence. This in turn improves the privacy of the local models. The analysis used the credit card and Canadian Institute for Cybersecurity Intrusion Detection Evaluation (CICIDS) datasets. Precision, recall, and accuracy are used as performance indicators to show the efficacy of the proposed strategy using FedAvg and FedProx. The experimental findings suggest that the proposed approach helps to safely use banking data from diverse sources to enhance customer banking services, obtaining accuracies of 99.55% and 83.72% for FedAvg and 99.57% and 84.63% for FedProx.
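The two methods named above can be illustrated in a few lines. This is a framework-free sketch in plain Python and deliberately does not use the Flower API. FedAvg aggregates client models by a sample-weighted average, and FedProx adds a proximal term (mu/2)·||w − w_global||² to each client's local objective, which shows up as an extra mu·(w − w_global) in the local gradient step. The learning rate, the mu value, and the function names are illustrative assumptions.

```python
from typing import List

Vector = List[float]


def fedavg(client_weights: List[Vector], client_sizes: List[int]) -> Vector:
    """Server side: sample-weighted average of the client models."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def fedprox_step(w: Vector, grad: Vector, w_global: Vector,
                 lr: float = 0.1, mu: float = 0.01) -> Vector:
    """Client side: one gradient step on loss + (mu/2) * ||w - w_global||^2.

    With mu = 0 this reduces to the plain local SGD step used by FedAvg.
    """
    return [
        wi - lr * (gi + mu * (wi - wgi))
        for wi, gi, wgi in zip(w, grad, w_global)
    ]


new_global = fedavg([[0.0, 0.0], [1.0, 2.0]], [1, 3])
pulled_back = fedprox_step([1.0], [0.0], [0.0], lr=0.5, mu=1.0)
```

The proximal term pulls each local model back toward the current global model, which limits client drift when the organizations' data distributions differ.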
With the rapid development of the Internet, network security and data privacy are increasingly valued. Although classical Network Intrusion Detection Systems (NIDS) based on Deep Learning (DL) models can provide good detection accuracy, collecting samples for centralized training brings a huge risk of data privacy leakage. Furthermore, training supervised deep learning models requires a large number of labeled samples, which are usually cumbersome to obtain. The “black-box” problem also makes the DL models of NIDS untrustworthy. In this paper, we propose a trusted Federated Learning (FL) Traffic IDS method called FL-TIDS to address the above-mentioned problems. In FL-TIDS, we design an unsupervised intrusion detection model based on autoencoders that alleviates the reliance on labeled samples. At the same time, we use FL for model training to protect data privacy. In addition, we design an improved SHAP interpretability method based on the chi-square test to perform interpretable analysis of the trained model. We conducted several experiments to evaluate the proposed FL-TIDS. First, we determine experimentally the structure and the number of neurons of the unsupervised AE model. Second, we evaluate the proposed method using the UNSW-NB15 and CICIDS2017 datasets. The experimental results show that the unsupervised AE model performs better than the other 7 intrusion detection models in terms of precision, recall, and F1-score. Then, federated learning is used to train the intrusion detection model. The experimental results indicate that the resulting model is more accurate than the locally trained model. Finally, we use the improved SHAP method based on the chi-square test to analyze explainability. The analysis results show that the features the model relies on for identification are consistent with the attack characteristics, and that the model is reliable.
Federated learning is an important distributed model training technique in the Internet of Things (IoT), in which participant selection is a key component that improves training efficiency and model accuracy. This module enables a central server to select a subset of participants to perform model training based on data and device information. By doing so, selected participants are rewarded and actively perform model training, while participants that are detrimental to training efficiency and model accuracy are excluded. However, in practice, participants may suspect that the central server has miscalculated or has not made the selection honestly. This lack of trustworthiness, which can demotivate participants, has received little attention. Another problem that has received little attention is the leakage of participants' private information during the selection process. We therefore propose a federated learning framework with auditable participant selection. It supports smart contracts in selecting a set of suitable participants based on their training loss without compromising privacy. Considering the possibility of malicious campaigning and impersonation of participants, the framework employs commitment schemes and zero-knowledge proofs to counteract these malicious behaviors. Finally, we analyze the security of the framework and conduct a series of experiments to demonstrate that the framework can effectively improve the efficiency of federated learning.
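To make the commitment building block concrete, here is a minimal hash-based commit-and-reveal sketch in which a participant commits to its training loss before selection and opens the commitment afterwards. The abstract does not say which commitment scheme the framework uses, so the SHA-256 construction, the byte encoding of the loss, and the function names are all assumptions. The zero-knowledge proofs and smart-contract logic are out of scope here.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(value: bytes) -> Tuple[str, bytes]:
    """Commit to `value`; returns (commitment, nonce).

    The random nonce hides the value, and the hash binds the committer to it.
    """
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + value).hexdigest()
    return digest, nonce


def verify(commitment: str, nonce: bytes, value: bytes) -> bool:
    """Check that the opened (nonce, value) pair matches the commitment."""
    expected = hashlib.sha256(nonce + value).hexdigest()
    return hmac.compare_digest(expected, commitment)


# A participant commits to its training loss, publishes only the commitment,
# and reveals (nonce, loss) later so that anyone can audit the selection.
loss = b"0.4213"
commitment, nonce = commit(loss)
opened_ok = verify(commitment, nonce, loss)
```

A participant that tries to open the commitment to a different loss value fails verification, which is the property an auditable selection step relies on.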
As a representative emerging machine learning technique, federated learning (FL) has gained considerable popularity for its special feature of “making data available but not visible”. However, potential problems remain, including privacy breaches, imbalances in payment, and inequitable distribution. These shortcomings make devices reluctant to contribute relevant data to FL, or lead them to refuse to participate at all. Therefore, an important but challenging issue in applying FL is to motivate as many participants as possible to provide high-quality data. In this paper, we propose an incentive mechanism for FL based on continuous zero-determinant (CZD) strategies from the perspective of game theory. We first model the interaction between the server and the devices during the FL process as a continuous iterative game. We then apply the CZD strategies first to two players and then to multiple players to optimize the social welfare of FL, and we prove that the server can keep social welfare at a high and stable level. Subsequently, we design an incentive mechanism based on the CZD strategies to attract devices to contribute all of their high-accuracy data to FL. Finally, we perform simulations to demonstrate that our proposed CZD-based incentive mechanism can indeed generate high and stable social welfare in FL.
The proliferation of IoT devices requires innovative approaches to gaining insights while preserving privacy and resources amid unprecedented data generation. However, FL development for IoT is still in its infancy and needs to be explored in various areas to understand the key challenges for deployment in real-world scenarios. This paper systematically reviews the available literature following the PRISMA guidelines. The study aims to provide a detailed overview of the increasing use of FL in IoT networks, including the architecture and challenges. A systematic review approach is used to collect, categorize, and analyze FL-IoT-based articles. A search was performed in the IEEE, Elsevier, arXiv, ACM, and WOS databases, and 92 articles were finally examined. The inclusion criteria were publication in English and the keywords “FL” and “IoT”. The methodology begins with an overview of recent advances in FL and the IoT, followed by a discussion of how these two technologies can be integrated. More specifically, we examine and evaluate the capabilities of FL by discussing communication protocols, frameworks, and architecture. We then present a comprehensive analysis of the use of FL in a number of key IoT applications, including smart healthcare, smart transportation, smart cities, smart industry, smart finance, and smart agriculture. The key findings from this analysis of FL IoT services and applications are also presented. Finally, we perform a comparative analysis of FL under IID (independent and identically distributed) and non-IID data against traditional centralized deep learning (DL) approaches. We conclude that FL has better performance, especially in terms of privacy protection and resource utilization. FL is well suited to preserving privacy because model training takes place on individual devices or edge nodes, eliminating the need for centralized data aggregation, which poses significant privacy risks. To facilitate development in this rapidly evolving field, the insights presented are intended to help practitioners and researchers navigate the complex terrain of FL and IoT.
Federated learning is an innovative machine learning technique that addresses the problems of centralized data storage while maintaining privacy and security. It involves constructing machine learning models using datasets spread across several data centers, including medical facilities, clinical research facilities, Internet of Things devices, and even mobile devices. The main goal of federated learning is to build robust models that benefit from the collective knowledge of these disparate datasets without centralizing sensitive information, reducing the risk of data loss, privacy breaches, or data exposure. The application of federated learning in the healthcare industry holds significant promise due to the wealth of data generated from various sources, such as patient records, medical imaging, wearable devices, and clinical research surveys. This research conducts a systematic evaluation and highlights essential issues for the selection and implementation of federated learning approaches in healthcare. It evaluates the effectiveness of federated learning strategies in healthcare and offers a systematic analysis of federated learning in this domain, including the evaluation metrics employed. In addition, this study highlights the increasing interest in federated learning applications in healthcare among scholars and provides foundations for further studies.
As an emerging joint learning model, federated learning is a promising way to combine model parameters of different users for training and inference without collecting users' original data. However, a practical and efficient solution has not been established in previous work due to the absence of efficient matrix computation and cryptography schemes in the privacy-preserving federated learning model, especially in partially homomorphic cryptosystems. In this paper, we propose a Practical and Efficient Privacy-preserving Federated Learning (PEPFL) framework. First, we present a lifted distributed ElGamal cryptosystem for federated learning, which can solve the multi-key problem in federated learning. Second, we develop a Practical Partially Single Instruction Multiple Data (PSIMD) parallelism scheme that can encode a plaintext matrix into a single plaintext for encryption, improving the encryption efficiency and reducing the communication cost in a partially homomorphic cryptosystem. In addition, based on the Convolutional Neural Network (CNN) and the designed cryptosystem, a novel privacy-preserving federated learning framework is designed using Momentum Gradient Descent (MGD). Finally, we evaluate the security and performance of PEPFL. The experimental results demonstrate that the scheme is practicable, effective, and secure, with low communication and computation costs.
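The property that makes lifted ElGamal useful for aggregation is that multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The toy sketch below shows this for a single key pair only. It does not implement the distributed (multi-key) variant or the PSIMD encoding from the abstract, and the modulus, generator, and message bound are illustrative assumptions that are far too small and unstructured for real security.

```python
import secrets
from typing import Tuple

P = 2**127 - 1   # a Mersenne prime; toy parameter, NOT a secure choice
G = 3            # base used for both keys and message encoding (assumption)
MAX_SUM = 80     # 3**m < P for m <= 80, so small sums decode unambiguously

Ciphertext = Tuple[int, int]


def keygen() -> Tuple[int, int]:
    """Return (secret key x, public key h = G^x mod P)."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)


def encrypt(h: int, m: int) -> Ciphertext:
    """Lifted ElGamal: the message sits in the exponent, (G^r, G^m * h^r)."""
    r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P


def add(c1: Ciphertext, c2: Ciphertext) -> Ciphertext:
    """Component-wise product of ciphertexts encrypts m1 + m2."""
    return (c1[0] * c2[0]) % P, (c1[1] * c2[1]) % P


def decrypt(x: int, c: Ciphertext) -> int:
    """Recover G^m, then find the small exponent m by exhaustive search."""
    shared = pow(c[0], x, P)
    g_m = (c[1] * pow(shared, P - 2, P)) % P  # Fermat inverse of `shared`
    acc = 1
    for m in range(MAX_SUM + 1):
        if acc == g_m:
            return m
        acc = (acc * G) % P
    raise ValueError("plaintext outside the searchable range")


sk, pk = keygen()
total = decrypt(sk, add(encrypt(pk, 5), encrypt(pk, 7)))
```

Decryption needs a discrete-log search over the message range, which is why schemes of this kind keep aggregated values small or pack them carefully.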
The application of artificial intelligence technology in the Internet of Vehicles (IoV) has attracted great research interest, with the goal of enabling smart transportation and traffic management. Meanwhile, concerns have been raised over the security and privacy of the large volumes of traffic and vehicle data. In this regard, Federated Learning (FL) with privacy protection features is considered a highly promising solution. However, in the FL process, the server side may take advantage of its dominant role in model aggregation to steal sensitive information of users, while the client side may also upload malicious data to compromise the training of the global model. Most existing privacy-preserving FL schemes in IoV fail to deal with threats from both sides at the same time. In this paper, we propose a Blockchain-based Privacy-preserving Federated Learning scheme named BPFL, which uses blockchain as the underlying distributed framework of FL. We improve the Multi-Krum technique and combine it with homomorphic encryption to achieve ciphertext-level model aggregation and model filtering, which enables the verifiability of the local models while preserving privacy. Additionally, we develop a reputation-based incentive mechanism to encourage users in IoV to actively participate in federated learning and to behave honestly. Security analysis and performance evaluations show that the proposed scheme can meet the security requirements and improve the performance of the FL model.
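For readers unfamiliar with the filtering step, the following is a sketch of standard Multi-Krum on plaintext updates. It is not the improved, ciphertext-level variant that BPFL proposes. Each update is scored by the sum of squared distances to its n − f − 2 nearest neighbours, and the m lowest-scoring updates are averaged. The parameter names f (assumed number of malicious clients) and m (number of updates kept) are illustrative.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two updates."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_krum(updates: List[Vector], f: int, m: int) -> Tuple[Vector, List[int]]:
    """Return (aggregated update, indices of the selected updates)."""
    n = len(updates)
    neighbours = n - f - 2
    if neighbours < 1:
        raise ValueError("Multi-Krum needs n >= f + 3 updates")
    scores = []
    for i, u in enumerate(updates):
        dists = sorted(sq_dist(u, v) for j, v in enumerate(updates) if j != i)
        scores.append(sum(dists[:neighbours]))
    # Keep the m updates whose neighbourhoods are tightest.
    selected = sorted(range(n), key=scores.__getitem__)[:m]
    dim = len(updates[0])
    aggregated = [
        sum(updates[i][k] for i in selected) / len(selected) for k in range(dim)
    ]
    return aggregated, selected


updates = [
    [1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05],
    [50.0, -50.0],  # an obviously poisoned update
]
aggregated, selected = multi_krum(updates, f=1, m=3)
```

The poisoned update sits far from every honest neighbourhood, so its score is large and it is excluded before averaging.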
With the increasing awareness of privacy protection and the improvement of relevant laws, federated learning has gradually become a new choice for cross-agency and cross-device machine learning. To solve the problems of privacy leakage, high computational overhead, and high traffic in some federated learning schemes, this paper proposes a multiplicative double privacy mask algorithm that is convenient for homomorphic addition aggregation. The combination of homomorphic encryption and secret sharing ensures that the server cannot compromise user privacy from the private gradients uploaded by the participants. At the same time, the proposed TQRR (Top-Q-Random-R) gradient selection algorithm is used to filter the gradients to be encrypted and uploaded efficiently, which reduces computing overhead by 51.78% and traffic by 64.87% while preserving the accuracy of the model. This makes the privacy-preserving federated learning framework lighter, so that it can adapt to more miniaturized federated learning terminals.
With the prevalence of Internet of Things (IoT) systems, smart cities comprise complex networks, including sensors, actuators, appliances, and cyber services. The complexity and heterogeneity of smart cities have made them vulnerable to sophisticated cyber-attacks, especially privacy-related attacks such as inference and data poisoning. Federated Learning (FL) has been regarded as a promising method to enable distributed learning with privacy-preserved intelligence in IoT applications. Even though the development of privacy-preserving FL has drawn great research interest, current research concentrates on FL with independent identically distributed (i.i.d.) data, and few studies have addressed the non-i.i.d. setting. FL is known to be vulnerable to Generative Adversarial Network (GAN) attacks, where an adversary can pose as a contributor participating in the training process to acquire the private data of other contributors. This paper proposes an innovative Privacy Protection-based Federated Deep Learning (PP-FDL) framework, which accomplishes data protection against privacy-related GAN attacks, along with high classification rates on non-i.i.d. data. PP-FDL is designed to enable fog nodes to cooperate to train the FDL model in a way that ensures contributors have no access to each other's data, where class probabilities are protected using a private identifier generated for each class. The PP-FDL framework is evaluated for image classification using simple convolutional networks trained on the MNIST and CIFAR-10 datasets. The empirical results reveal that PP-FDL can achieve data protection and that the framework outperforms three other state-of-the-art models, with accuracy improvements of 3%–8%.
The rapid expansion of artificial intelligence (AI) applications has raised significant concerns about user privacy, prompting the development of privacy-preserving machine learning (ML) paradigms such as federated learning (FL). FL enables the distributed training of ML models, keeping data on local devices and thus addressing the privacy concerns of users. However, challenges arise from the heterogeneous nature of mobile client devices, partial engagement in training, and non-independent identically distributed (non-IID) data, leading to performance degradation and optimization objective bias in FL training. With the development of 5G/6G networks and the integration of cloud and edge computing resources, globally distributed cloud computing resources can be effectively utilized to optimize the FL process. By choosing a specific parameter server through a selection mechanism, it is possible to reduce network latency overhead without increasing monetary cost, and also to balance the objectives of communication optimization and low-engagement mitigation, which cannot be achieved simultaneously in the single-server frameworks of existing works. In this paper, we propose the FedAdaSS algorithm, an adaptive parameter server selection mechanism designed to optimize the training efficiency in each round of FL training by selecting the most appropriate server as the parameter server. Our approach leverages the flexibility of cloud computing resources and allows organizers to strategically select servers for data broadcasting and aggregation, thus improving training performance while maintaining cost efficiency. The FedAdaSS algorithm estimates the utility of client systems and servers and incorporates an adaptive random reshuffling strategy that selects the optimal server in each round of the training process. Theoretical analysis confirms the convergence of FedAdaSS under strong convexity and L-smoothness assumptions, and comparative experiments within the FLSim framework demonstrate a reduction in training round-to-accuracy of 12%–20% compared to the Federated Averaging (FedAvg) with random reshuffling method using a single server. Furthermore, FedAdaSS effectively mitigates the performance loss caused by low client engagement, reducing the loss indicator by 50%.
Sharing data while protecting privacy in the industrial Internet is a significant challenge. Traditional machine learning methods require a combination of all data for training; however, this approach can be limited by data availability and privacy concerns. Federated learning (FL) has gained considerable attention because it allows for decentralized training on multiple local datasets. However, the training data collected by data providers are often non-independent and identically distributed (non-IID), resulting in poor FL performance. This paper proposes a privacy-preserving approach for sharing non-IID data in the industrial Internet using an FL approach based on blockchain technology. To overcome the problem of non-IID data leading to poor training accuracy, we propose dynamically updating the local model based on the divergence of the global and local models. This approach can significantly improve the accuracy of FL training when there is relatively large dispersion. In addition, we design a dynamic gradient clipping algorithm to alleviate the influence of noise on the model accuracy and to reduce the potential privacy leakage caused by sharing model parameters. Finally, we evaluate the performance of the proposed scheme using commonly used open-source image datasets. The simulation results demonstrate that our method can significantly enhance the accuracy while protecting privacy and maintaining efficiency, thereby providing a new solution to data-sharing and privacy-protection challenges in the industrial Internet.
When data privacy is imposed as a necessity, federated learning (FL) emerges as a relevant artificial intelligence field for developing machine learning (ML) models in a distributed and decentralized environment. FL allows ML models to be trained on local devices without any need for centralized data transfer, thereby reducing both the exposure of sensitive data and the possibility of data interception by malicious third parties. This paradigm has gained momentum in the last few years, spurred by the plethora of real-world applications that have leveraged its ability to improve the efficiency of distributed learning and to accommodate numerous participants with their data sources. By virtue of FL, models can be learned from all such distributed data sources while preserving data privacy. The aim of this paper is to provide a practical tutorial on FL, including a short methodology and a systematic analysis of existing software frameworks. Furthermore, our tutorial provides exemplary case studies from three complementary perspectives: i) Foundations of FL, describing the main components of FL, from key elements to FL categories; ii) Implementation guidelines and exemplary case studies, by systematically examining the functionalities provided by existing software frameworks for FL deployment, devising a methodology to design an FL scenario, and providing exemplary case studies with source code for different ML approaches; and iii) Trends, briefly reviewing a non-exhaustive list of research directions that are under active investigation in the current FL landscape. The ultimate purpose of this work is to establish itself as a reference work for researchers, developers, and data scientists willing to explore the capabilities of FL in practical applications.
To protect vehicular privacy and speed up the execution of tasks, federated learning is introduced in the Internet of Vehicles (IoV), where users execute model training locally and upload local models to the base station without massive raw data exchange. However, the heterogeneous computing and communication resources of vehicles cause a straggler effect, which weakens the reliability of federated learning. Dropping vehicles with limited resources restricts the available training data. As a result, the accuracy and applicability of federated learning models are reduced. To mitigate the straggler effect and improve the performance of federated learning, we propose a reconfigurable intelligent surface (RIS)-assisted federated learning framework to enhance the communication reliability of parameter transmission in the IoV. Furthermore, we optimize the phase shift of the RIS to achieve a more reliable communication environment. In addition, we define vehicular competence to measure both vehicular trustworthiness and resources. Based on vehicular competence, the straggler effect is mitigated by offloading the training tasks of computing stragglers to surrounding vehicles with high competence. The experimental results verify that our proposed framework can improve the reliability of federated learning in terms of computing and communication in the IoV.
As the volume of healthcare and medical data from diverse sources increases, real-world scenarios involving data sharing and collaboration face certain challenges, including the risk of privacy leakage, difficulty in data fusion, low reliability of data storage, and low effectiveness of data sharing. To guarantee the service quality of data collaboration, this paper presents a privacy-preserving Healthcare and Medical Data Collaboration Service System combining Blockchain with Federated Learning, termed FL-HMChain. This system is composed of three layers: data extraction and storage, data management, and data application. Focusing on healthcare and medical data, a healthcare and medical blockchain is constructed to realize data storage, transfer, processing, and access with security, real-time performance, reliability, and integrity. An improved master node selection consensus mechanism is presented to detect and prevent dishonest behavior, ensuring the overall reliability and trustworthiness of the collaborative model training process. Furthermore, healthcare and medical data collaboration services in real-world scenarios are discussed and developed. To further validate the performance of FL-HMChain, a Convolutional Neural Network-based Federated Learning (FL-CNN-HMChain) model is investigated for medical image identification. This model achieves better performance than the baseline Convolutional Neural Network (CNN), with average improvements of 4.7% in Area Under Curve (AUC) and 7% in Accuracy (ACC). Furthermore, the probability of privacy leakage can be effectively reduced by the blockchain-based parameter transfer mechanism in federated learning between local and global models.
Digital Twin (DT) supports real-time analysis and provides a reliable simulation platform in the Internet of Things (IoT). The creation and application of DT hinge on large amounts of data, which puts pressure on the application of Artificial Intelligence (AI) for DT descriptions and intelligent decision-making. Federated Learning (FL) is a cutting-edge technology that enables geographically dispersed devices to collaboratively train a shared global model locally rather than relying on a data center to perform model training. Therefore, DT can benefit from being combined with FL, which solves the "data island" problem in traditional AI. However, FL still faces serious challenges, such as single-point failures, poisoning attacks, and a lack of effective incentive mechanisms. Before DT can be deployed successfully, the issues caused by FL should be tackled. Researchers from industry and academia have recognized the potential of introducing Blockchain Technology (BT) into FL to overcome these challenges, since BT, acting as a distributed and immutable ledger, can store data in a secure, traceable, and trusted manner. However, to the best of our knowledge, a comprehensive literature review on this topic is still missing. In this paper, we review existing works on blockchain-enabled FL and visualize their prospects with DT. To this end, we first propose evaluation requirements with respect to security, fault tolerance, fairness, efficiency, cost saving, profitability, and support for heterogeneity. Then, we classify the existing literature according to the functionalities of BT in FL and analyze its advantages and disadvantages based on the proposed evaluation requirements. Finally, we discuss open problems in the existing literature and the future of DT supported by blockchain-enabled FL, based on which we further propose some directions for future research.
Abstract: In the context of edge computing environments in general and the metaverse in particular, federated learning (FL) has emerged as a distributed machine learning paradigm that allows multiple users to collaborate on training a shared machine learning model locally, eliminating the need to upload raw data to a central server. It is perhaps the only training paradigm that preserves the privacy of user data, which is essential for computing environments as personal as the metaverse. However, the originally proposed FL architecture does not scale to the large number of user devices in the metaverse community. To mitigate this problem, hierarchical federated learning (HFL) has been introduced as a general distributed learning paradigm, inspiring a number of research works. In this paper, we present several types of HFL architectures, with a special focus on the three-layer client-edge-cloud HFL architecture, which is most pertinent to the metaverse due to its delay-sensitive nature. We also examine works that take advantage of the natural layered organization of three-layer client-edge-cloud HFL to tackle some of the most challenging problems in FL within the metaverse. Finally, we outline some future research directions of HFL in the metaverse.
Funding: Sponsored by the National Key R&D Program of China (No. 2018YFB2100400), the National Natural Science Foundation of China (Nos. 62002077, 61872100), the Major Research Plan of the National Natural Science Foundation of China (No. 92167203), the Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515110385), the China Postdoctoral Science Foundation (No. 2022M710860), the Zhejiang Lab (No. 2020NF0AB01), and the Guangzhou Science and Technology Plan Project (No. 202102010440).
Abstract: Benefiting from the development of Federated Learning (FL) and distributed communication systems, large-scale intelligent applications have become possible. Distributed devices not only provide adequate training data but also cause privacy leakage and energy consumption. How to optimize energy consumption in distributed communication systems while ensuring user privacy and model accuracy has become an urgent challenge. In this paper, we define FL as a 3-layer architecture comprising users, agents, and a server. To find a balance among model training accuracy, privacy-preserving effect, and energy consumption, we design the training process of FL as game models. We use an extensive game tree to analyze the key elements that influence the players' decisions in the single game, and then find an incentive mechanism that meets the social norms through the repeated game. The experimental results show that the Nash equilibrium we obtained satisfies the laws of reality, and that the proposed incentive mechanism can also encourage users to submit high-quality data in FL. After multiple rounds of play, the incentive mechanism can help all players find the optimal strategies for energy, privacy, and accuracy of FL in distributed communication systems.
Funding: This research was funded by the National Natural Science Foundation of China (No. 62272124), the National Key Research and Development Program of China (No. 2022YFB2701401), the Guizhou Province Science and Technology Plan Project (Grant No. Qiankehe Platform Talent [2020]5017), the Research Project of Guizhou University for Talent Introduction (No. [2020]61), the Cultivation Project of Guizhou University (No. [2019]56), and the Open Fund of the Key Laboratory of Advanced Manufacturing Technology, Ministry of Education (GZUAMT2021KF[01]).
Abstract: In the assessment of car insurance claims, the claim rate presents a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining a Tweedie regression model involves training on a centralized dataset; when the data is provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to perform intersection operations between the two parties holding the data. After determining which entities are shared, the participants train the model locally using the shared entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption algorithms are introduced to exchange and update the intermediate model parameters, so that the parties collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data, and outperform the Tweedie regression model learned independently by a single party.
Funding: Supported by the Key Research and Development Program of China (No. 2022YFC3005401), the Key Research and Development Program of Yunnan Province, China (Nos. 202203AA080009, 202202AF080003), the Science and Technology Achievement Transformation Program of Jiangsu Province, China (BA2021002), and the Fundamental Research Funds for the Central Universities (Nos. B220203006, B210203024).
Abstract: Data sharing and privacy protection are made possible by federated learning, which allows for continuous model parameter sharing between several clients and a central server. Multiple reliable and high-quality clients must participate in practical applications for the federated learning global model to be accurate, but because the clients are independent, the central server cannot fully control their behavior. The central server has no way of knowing the correctness of the model parameters provided by each client in a given round, so clients may purposefully or unwittingly submit anomalous data, leading to abnormal behavior, such as becoming malicious attackers or defective clients. To reduce the negative consequences, it is crucial to quickly detect these abnormal clients and to apply incentives to them. In this paper, we propose a Federated Learning framework for Detecting and Incentivizing Abnormal Clients (FL-DIAC) to accomplish efficient and secure federated learning. For the abnormal client detection problem in particular, we build a detector that introduces an auto-encoder for anomaly detection and use it to identify anomalies and prevent the involvement of abnormal clients. Before the model parameters are input to the detector, we apply a Fourier transform-based anomaly data detection method for dimensionality reduction in order to reduce the computational complexity. Additionally, we create a credit score-based incentive structure to encourage clients to participate actively in training. Three training models (CNN, MLP, and ResNet-18) and three datasets (MNIST, Fashion MNIST, and CIFAR-10) have been used in experiments. According to theoretical analysis and experimental findings, FL-DIAC is superior to other federated learning schemes of the same type in terms of effectiveness.
Abstract: The growing data pool in the finance sector confronts machine learning (ML) with new complications. Banking data has significant financial implications and is confidential. Combining users' data from several organizations for various banking services may result in intrusions and privacy leakage. As a result, this study employs federated learning (FL) using the Flower framework to preserve each organization's privacy while collaborating to build a robust shared global model. However, diverse data distributions in the collaborative training process might result in inadequate model learning and a lack of privacy. To address this issue, the present paper proposes the implementation of the Federated Averaging (FedAvg) and Federated Proximal (FedProx) methods in the Flower framework, which take advantage of data locality during training while guaranteeing global convergence. As a result, the privacy of the local models is improved. This analysis used the credit card and Canadian Institute for Cybersecurity Intrusion Detection Evaluation (CICIDS) datasets. Precision, recall, and accuracy are used as performance indicators to show the efficacy of the proposed strategy using FedAvg and FedProx. The experimental findings suggest that the proposed approach helps to safely use banking data from diverse sources to enhance customer banking services, obtaining accuracies of 99.55% and 83.72% for FedAvg and 99.57% and 84.63% for FedProx.
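As a rough illustration of the two methods named in the abstract above, the following pure-Python sketch shows the standard FedAvg aggregation rule (a sample-count-weighted average of client parameters) and the standard FedProx modification (a proximal term added to each client's gradient). It does not use the Flower API and is not the paper's implementation; the function names `fedavg` and `fedprox_grad`, the flat-list parameter representation, and the value of `mu` are assumptions made for this example.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Aggregate client parameter vectors, weighting each by its sample count."""
    total = sum(num_samples)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, num_samples)) / total
        for i in range(dim)
    ]


def fedprox_grad(local_grad: Sequence[float],
                 local_w: Sequence[float],
                 global_w: Sequence[float],
                 mu: float) -> List[float]:
    """Gradient of the local loss plus the proximal term (mu/2) * ||w - w_global||^2.

    The proximal term pulls the local model back toward the global model,
    which limits client drift when data distributions differ across clients.
    """
    return [g + mu * (w - wg) for g, w, wg in zip(local_grad, local_w, global_w)]


# Hypothetical usage: two clients holding 1 and 3 samples respectively.
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
proximal_step = fedprox_grad([0.5, -1.0], [1.0, 2.0], [0.0, 2.0], mu=0.1)
```

With `mu = 0`, `fedprox_grad` returns the plain local gradient, so FedAvg's local update is recovered as a special case.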
Funding: Supported by the National Natural Science Foundation of China under Grant 61972208, the National Natural Science Foundation (General Program) of China under Grant 61972211, the National Key Research and Development Project of China under Grant 2020YFB1804700, the Future Network Innovation Research and Application Projects under Grant No. 2021FNA02006, and the 2021 Jiangsu Postgraduate Research Innovation Plan under Grant No. KYCX210794.
Abstract: With the rapid development of the Internet, network security and data privacy are increasingly valued. Although classical Network Intrusion Detection Systems (NIDS) based on Deep Learning (DL) models can provide good detection accuracy, collecting samples for centralized training brings a huge risk of data privacy leakage. Furthermore, the training of supervised deep learning models requires a large number of labeled samples, which are usually cumbersome to obtain. The “black-box” problem also makes the DL models of NIDS untrustworthy. In this paper, we propose a trusted Federated Learning (FL) Traffic IDS method called FL-TIDS to address the above-mentioned problems. In FL-TIDS, we design an unsupervised intrusion detection model based on autoencoders (AE) that alleviates the reliance on labeled samples. At the same time, we use FL for model training to protect data privacy. In addition, we design an improved SHAP interpretability method based on the chi-square test to perform interpretable analysis of the trained model. We conducted several experiments to evaluate the proposed FL-TIDS. First, we determined experimentally the structure and the number of neurons of the unsupervised AE model. Second, we evaluated the proposed method using the UNSW-NB15 and CICIDS2017 datasets. The experimental results show that the unsupervised AE model performs better than the other 7 intrusion detection models in terms of precision, recall, and F1-score. Then, federated learning is used to train the intrusion detection model. The experimental results indicate that the model is more accurate than the local learning model. Finally, we use the improved SHAP explainability method based on the chi-square test to analyze explainability. The analysis results show that the identification characteristics of the model are consistent with the attack characteristics, and that the model is reliable.
Funding: Supported by the Key-Area Research and Development Program of Guangdong Province under Grant No. 2020B0101090004, the National Natural Science Foundation of China under Grant No. 62072215, the Guangzhou Basic Research Plan City-School Joint Funding Project under Grant No. 2024A03J0405, the Guangzhou Basic and Applied Basic Research Foundation under Grant No. 2024A04J3458, and the State Archives Administration Science and Technology Program Plan of China under Grant 2023-X-028.
Abstract: Federated learning is an important distributed model training technique in the Internet of Things (IoT), in which participant selection is a key component that plays a role in improving training efficiency and model accuracy. This module enables a central server to select a subset of participants to perform model training based on data and device information. By doing so, selected participants are rewarded and actively perform model training, while participants that are detrimental to training efficiency and model accuracy are excluded. However, in practice, participants may suspect that the central server has miscalculated and thus has not made the selection honestly. This lack of trustworthiness, which can demotivate participants, has received little attention. Another problem that has received little attention is the leakage of participants' private information during the selection process. We therefore propose a federated learning framework with auditable participant selection. It supports smart contracts in selecting a set of suitable participants based on their training loss without compromising privacy. Considering the possibility of malicious campaigning and impersonation of participants, the framework employs commitment schemes and zero-knowledge proofs to counteract these malicious behaviors. Finally, we analyze the security of the framework and conduct a series of experiments to demonstrate that the framework can effectively improve the efficiency of federated learning.
Funding: Partially supported by the National Natural Science Foundation of China (62173308), the Natural Science Foundation of Zhejiang Province of China (LR20F030001), and the Jinhua Science and Technology Project (2022-1-042).
Abstract: As a representative emerging machine learning technique, federated learning (FL) has gained considerable popularity for its special feature of “making data available but not visible”. However, potential problems remain, including privacy breaches, imbalances in payment, and inequitable distribution. These shortcomings make devices reluctant to contribute relevant data to FL, or even unwilling to participate in it. Therefore, in the application of FL, an important but also challenging issue is to motivate as many participants as possible to provide high-quality data to FL. In this paper, we propose an incentive mechanism for FL based on continuous zero-determinant (CZD) strategies from the perspective of game theory. We first model the interaction between the server and the devices during the FL process as a continuous iterative game. We then apply the CZD strategies, first for two players and then for multiple players, to optimize the social welfare of FL, for which we prove that the server can keep social welfare at a high and stable level. Subsequently, we design an incentive mechanism based on the CZD strategies to attract devices to contribute all of their high-accuracy data to FL. Finally, we perform simulations to demonstrate that our proposed CZD-based incentive mechanism can indeed generate high and stable social welfare in FL.
Abstract: The proliferation of IoT devices requires innovative approaches to gaining insights while preserving privacy and resources amid unprecedented data generation. However, FL development for IoT is still in its infancy and needs to be explored in various areas to understand the key challenges for deployment in real-world scenarios. The paper systematically reviews the available literature using the PRISMA guiding principle. The study aims to provide a detailed overview of the increasing use of FL in IoT networks, including the architecture and challenges. A systematic review approach is used to collect, categorize, and analyze FL-IoT-based articles. A search was performed in the IEEE, Elsevier, arXiv, ACM, and WOS databases, and 92 articles were finally examined. The inclusion criteria were publication in English and the presence of the keywords “FL” and “IoT”. The methodology begins with an overview of recent advances in FL and the IoT, followed by a discussion of how these two technologies can be integrated. More specifically, we examine and evaluate the capabilities of FL by discussing communication protocols, frameworks, and architecture. We then present a comprehensive analysis of the use of FL in a number of key IoT applications, including smart healthcare, smart transportation, smart cities, smart industry, smart finance, and smart agriculture. The key findings from this analysis of FL IoT services and applications are also presented. Finally, we perform a comparative analysis of FL with IID (independent and identically distributed) and non-IID data against traditional centralized deep learning (DL) approaches. We conclude that FL has better performance, especially in terms of privacy protection and resource utilization. FL is excellent for preserving privacy because model training takes place on individual devices or edge nodes, eliminating the need for centralized data aggregation, which poses significant privacy risks. To facilitate development in this rapidly evolving field, the insights presented are intended to help practitioners and researchers navigate the complex terrain of FL and IoT.
Funding: This work was supported by a research fund from Chosun University, 2023.
Abstract: Federated learning is an innovative machine learning technique that deals with centralized data storage issues while maintaining privacy and security. It involves constructing machine learning models using datasets spread across several data centers, including medical facilities, clinical research facilities, Internet of Things devices, and even mobile devices. The main goal of federated learning is to build robust models that benefit from the collective knowledge of these disparate datasets without centralizing sensitive information, reducing the risk of data loss, privacy breaches, or data exposure. The application of federated learning in the healthcare industry holds significant promise due to the wealth of data generated from various sources, such as patient records, medical imaging, wearable devices, and clinical research surveys. This research conducts a systematic evaluation and highlights essential issues for the selection and implementation of federated learning approaches in healthcare. It evaluates the effectiveness of federated learning strategies in the field of healthcare and offers a systematic analysis of federated learning in the healthcare domain, encompassing the evaluation metrics employed. In addition, this study highlights the increasing interest in federated learning applications in healthcare among scholars and provides foundations for further studies.
Funding: Supported by the National Natural Science Foundation of China under Grant No. U19B2021, the Key Research and Development Program of Shaanxi under Grant No. 2020ZDLGY08-04, the Key Technologies R&D Program of Henan Province under Grant No. 212102210084, and the Innovation Scientists and Technicians Troop Construction Projects of Henan Province.
Abstract: As an emerging joint learning model, federated learning is a promising way to combine the model parameters of different users for training and inference without collecting users' original data. However, a practical and efficient solution has not been established in previous work due to the absence of efficient matrix computation and cryptography schemes in the privacy-preserving federated learning model, especially in partially homomorphic cryptosystems. In this paper, we propose a Practical and Efficient Privacy-preserving Federated Learning (PEPFL) framework. First, we present a lifted distributed ElGamal cryptosystem for federated learning, which can solve the multi-key problem in federated learning. Second, we develop a Practical Partially Single Instruction Multiple Data (PSIMD) parallelism scheme that can encode a plaintext matrix into a single plaintext for encryption, improving the encryption efficiency and reducing the communication cost in a partially homomorphic cryptosystem. In addition, based on the Convolutional Neural Network (CNN) and the designed cryptosystem, a novel privacy-preserving federated learning framework is designed using Momentum Gradient Descent (MGD). Finally, we evaluate the security and performance of PEPFL. The experimental results demonstrate that the scheme is practicable, effective, and secure, with low communication and computation costs.
Funding: Supported by the National Natural Science Foundation of China under Grant 61972148.
Abstract: The application of artificial intelligence technology in the Internet of Vehicles (IoV) has attracted great research interest, with the goal of enabling smart transportation and traffic management. Meanwhile, concerns have been raised over the security and privacy of the large volumes of traffic and vehicle data. In this regard, Federated Learning (FL) with privacy protection features is considered a highly promising solution. However, in the FL process, the server side may take advantage of its dominant role in model aggregation to steal sensitive user information, while the client side may upload malicious data to compromise the training of the global model. Most existing privacy-preserving FL schemes in IoV fail to deal with threats from both sides at the same time. In this paper, we propose a Blockchain-based Privacy-preserving Federated Learning scheme named BPFL, which uses blockchain as the underlying distributed framework of FL. We improve the Multi-Krum technique and combine it with homomorphic encryption to achieve ciphertext-level model aggregation and model filtering, which enables the verifiability of the local models while preserving privacy. Additionally, we develop a reputation-based incentive mechanism to encourage users in IoV to actively participate in federated learning and to behave honestly. Security analysis and performance evaluations are conducted to show that the proposed scheme can meet the security requirements and improve the performance of the FL model.
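For readers unfamiliar with the Multi-Krum technique mentioned in the abstract above, the sketch below shows the standard plaintext version of the rule: each update is scored by the sum of squared distances to its n - f - 2 nearest neighbours, and the m lowest-scoring updates are averaged. This is not BPFL's improved, ciphertext-level variant, whose details the abstract does not give; the names `multi_krum`, `f` (assumed number of Byzantine clients), and `m` (number of updates kept) are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_krum(updates: Sequence[Sequence[float]],
               f: int, m: int) -> Tuple[List[int], List[float]]:
    """Select the m updates with the lowest Krum scores and average them.

    Assumes at most f of the n updates are malicious and n >= 2f + 3.
    """
    n = len(updates)
    if n < 2 * f + 3:
        raise ValueError("Multi-Krum requires n >= 2f + 3")
    if not 1 <= m <= n:
        raise ValueError("m must be between 1 and n")
    k = n - f - 2  # number of nearest neighbours used in each score
    scores = []
    for i, u in enumerate(updates):
        dists = sorted(sq_dist(u, v) for j, v in enumerate(updates) if j != i)
        scores.append(sum(dists[:k]))
    selected = sorted(range(n), key=lambda i: scores[i])[:m]
    dim = len(updates[0])
    average = [sum(updates[i][d] for i in selected) / m for d in range(dim)]
    return selected, average


# Hypothetical usage: four similar updates and one outlier (index 4).
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05], [10.0, -10.0]]
selected, aggregated = multi_krum(updates, f=1, m=3)
```

Because the outlier is far from every other update, its score is much larger than the others and it is excluded from the averaged result.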
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62172436, 62102452), the National Key Research and Development Program of China (2023YFB3106100, 2021YFB3100100), and the Natural Science Foundation of Shaanxi Province (2023-JC-YB-584).
Abstract: With the increasing awareness of privacy protection and the improvement of relevant laws, federated learning has gradually become a new choice for cross-agency and cross-device machine learning. To solve the problems of privacy leakage, high computational overhead, and high traffic in some federated learning schemes, this paper proposes a multiplicative double privacy mask algorithm that is convenient for homomorphic additive aggregation. The combination of homomorphic encryption and secret sharing ensures that the server cannot compromise user privacy from the private gradients uploaded by the participants. At the same time, the proposed TQRR (Top-Q-Random-R) gradient selection algorithm is used to filter the gradients to be encrypted and uploaded efficiently, which reduces computing overhead by 51.78% and traffic by 64.87% while preserving the accuracy of the model. This makes the privacy-preserving federated learning framework lighter, so that it can adapt to more miniaturized federated learning terminals.
Abstract: With the prevalence of Internet of Things (IoT) systems, smart cities comprise complex networks, including sensors, actuators, appliances, and cyber services. The complexity and heterogeneity of smart cities have made them vulnerable to sophisticated cyber-attacks, especially privacy-related attacks such as inference and data poisoning. Federated Learning (FL) has been regarded as a promising method to enable distributed learning with privacy-preserved intelligence in IoT applications. Even though developing privacy-preserving FL has drawn great research interest, current research concentrates only on FL with independent and identically distributed (i.i.d.) data, and few studies have addressed the non-i.i.d. setting. FL is known to be vulnerable to Generative Adversarial Network (GAN) attacks, where an adversary can pretend to act as a contributor participating in the training process in order to acquire the private data of other contributors. This paper proposes an innovative Privacy Protection-based Federated Deep Learning (PP-FDL) framework, which accomplishes data protection against privacy-related GAN attacks, along with high classification rates on non-i.i.d. data. PP-FDL is designed to enable fog nodes to cooperate in training the FDL model in a way that ensures contributors have no access to each other's data, where class probabilities are protected using a private identifier generated for each class. The PP-FDL framework is evaluated for image classification using simple convolutional networks trained on the MNIST and CIFAR-10 datasets. The empirical results reveal that PP-FDL can achieve data protection and that the framework outperforms three other state-of-the-art models, with accuracy improvements of 3%–8%.
Funding: Supported in part by the National Natural Science Foundation of China under Grants U22B2005 and 62372462.
Abstract: The rapid expansion of artificial intelligence (AI) applications has raised significant concerns about user privacy, prompting the development of privacy-preserving machine learning (ML) paradigms such as federated learning (FL). FL enables the distributed training of ML models, keeping data on local devices and thus addressing the privacy concerns of users. However, challenges arise from the heterogeneous nature of mobile client devices, partial engagement in training, and non-independent identically distributed (non-IID) data, leading to performance degradation and optimization objective bias in FL training. With the development of 5G/6G networks and the integration of cloud and edge computing resources, globally distributed cloud computing resources can be effectively utilized to optimize the FL process. Selecting a specific parameter server through a selection mechanism reduces network latency overhead without increasing monetary cost, and also balances the objectives of communication optimization and low-engagement mitigation, which cannot be achieved simultaneously in the single-server frameworks of existing works. In this paper, we propose the FedAdaSS algorithm, an adaptive parameter server selection mechanism designed to optimize the training efficiency in each round of FL training by selecting the most appropriate server as the parameter server. Our approach leverages the flexibility of cloud computing power and allows organizers to strategically select servers for data broadcasting and aggregation, thus improving training performance while maintaining cost efficiency. The FedAdaSS algorithm estimates the utility of client systems and servers and incorporates an adaptive random reshuffling strategy that selects the optimal server in each round of the training process. Theoretical analysis confirms the convergence of FedAdaSS under strong convexity and L-smoothness assumptions, and comparative experiments within the FLSim framework demonstrate a reduction in training round-to-accuracy of 12%–20% compared to Federated Averaging (FedAvg) with random reshuffling under a single server. Furthermore, FedAdaSS effectively mitigates the performance loss caused by low client engagement, reducing the loss indicator by 50%.
Funding: This work was supported by the National Key R&D Program of China under Grant 2023YFB2703802 and the Hunan Province Innovation and Entrepreneurship Training Program for College Students under Grant S202311528073.
Abstract: Sharing data while protecting privacy in the industrial Internet is a significant challenge. Traditional machine learning methods require all data to be combined for training; however, this approach can be limited by data availability and privacy concerns. Federated learning (FL) has gained considerable attention because it allows for decentralized training on multiple local datasets. However, the training data collected by data providers are often non-independent and identically distributed (non-IID), resulting in poor FL performance. This paper proposes a privacy-preserving approach for sharing non-IID data in the industrial Internet using an FL approach based on blockchain technology. To overcome the problem of non-IID data leading to poor training accuracy, we propose dynamically updating the local model based on the divergence between the global and local models. This approach can significantly improve the accuracy of FL training when there is relatively large dispersion. In addition, we design a dynamic gradient clipping algorithm to alleviate the influence of noise on model accuracy and to reduce potential privacy leakage caused by sharing model parameters. Finally, we evaluate the performance of the proposed scheme using commonly used open-source image datasets. The simulation results demonstrate that our method can significantly enhance accuracy while protecting privacy and maintaining efficiency, thereby providing a new solution to data-sharing and privacy-protection challenges in the industrial Internet.
Funding: the R&D&I, Spain grants PID2020-119478GB-I00 and PID2020-115832GB-I00 funded by MCIN/AEI/10.13039/501100011033. N. Rodríguez-Barroso was supported by the grant FPU18/04475 funded by MCIN/AEI/10.13039/501100011033 and by "ESF Investing in your future" Spain. J. Moyano was supported by a postdoctoral Juan de la Cierva Formación grant FJC2020-043823-I funded by MCIN/AEI/10.13039/501100011033 and by European Union NextGenerationEU/PRTR. J. Del Ser acknowledges funding support from the Spanish Centro para el Desarrollo Tecnológico Industrial (CDTI) through the AI4ES project and the Department of Education of the Basque Government (consolidated research group MATHMODE, IT1456-22).
Abstract: When data privacy is imposed as a necessity, federated learning (FL) emerges as a relevant artificial intelligence field for developing machine learning (ML) models in a distributed and decentralized environment. FL allows ML models to be trained on local devices without any need for centralized data transfer, thereby reducing both the exposure of sensitive data and the possibility of data interception by malicious third parties. This paradigm has gained momentum in the last few years, spurred by the plethora of real-world applications that have leveraged its ability to improve the efficiency of distributed learning and to accommodate numerous participants with their data sources. By virtue of FL, models can be learned from all such distributed data sources while preserving data privacy. The aim of this paper is to provide a practical tutorial on FL, including a short methodology and a systematic analysis of existing software frameworks. Furthermore, our tutorial provides exemplary case studies from three complementary perspectives: i) Foundations of FL, describing the main components of FL, from key elements to FL categories; ii) Implementation guidelines and exemplary case studies, by systematically examining the functionalities provided by existing software frameworks for FL deployment, devising a methodology to design an FL scenario, and providing exemplary case studies with source code for different ML approaches; and iii) Trends, briefly reviewing a non-exhaustive list of research directions that are under active investigation in the current FL landscape. The ultimate purpose of this work is to establish itself as a reference work for researchers, developers, and data scientists willing to explore the capabilities of FL in practical applications.
Funding: supported in part by the Fundamental Research Funds for the Central Universities (2022JBQY004); the Beijing Natural Science Foundation (L211013); the Basic Research Program under Grant JCKY2022XXXX145; the National Natural Science Foundation of China (No. 62221001, 62201030); the Science and Technology Research and Development Plan of China Railway Co., Ltd (No. K2022G018); the project of CHN Energy Shuohuang Railway under Grant SHTL-2332; and the China Postdoctoral Science Foundation (2021TQ0028, 2021M700369).
Abstract: To protect vehicular privacy and speed up the execution of tasks, federated learning is introduced in the Internet of Vehicles (IoV), where users execute model training locally and upload local models to the base station without massive raw data exchange. However, the heterogeneous computing and communication resources of vehicles cause a straggler effect, which weakens the reliability of federated learning. Dropping out vehicles with limited resources confines the training data. As a result, the accuracy and applicability of federated learning models will be reduced. To mitigate the straggler effect and improve the performance of federated learning, we propose a reconfigurable intelligent surface (RIS)-assisted federated learning framework to enhance the communication reliability for parameter transmission in the IoV. Furthermore, we optimize the phase shift of the RIS to achieve a more reliable communication environment. In addition, we define vehicular competence to measure both vehicular trustworthiness and resources. Based on the vehicular competence, the straggler effect is mitigated by offloading the training tasks of computing stragglers to surrounding vehicles with high competence. The experimental results verify that our proposed framework can improve the reliability of federated learning in terms of computing and communication in the IoV.
Funding: We are thankful for the funding support from the Science and Technology Projects of the National Archives Administration of China (Grant Number 2022-R-031) and the Fundamental Research Funds for the Central Universities, Central China Normal University (Grant Number CCNU24CG014).
Abstract: As the volume of healthcare and medical data from diverse sources increases, real-world scenarios involving data sharing and collaboration face certain challenges, including the risk of privacy leakage, difficulty in data fusion, low reliability of data storage, and low effectiveness of data sharing. To guarantee the service quality of data collaboration, this paper presents a privacy-preserving Healthcare and Medical Data Collaboration Service System combining Blockchain with Federated Learning, termed FL-HMChain. This system is composed of three layers: data extraction and storage, data management, and data application. Focusing on healthcare and medical data, a healthcare and medical blockchain is constructed to realize data storage, transfer, processing, and access with security, real-time performance, reliability, and integrity. An improved master node selection consensus mechanism is presented to detect and prevent dishonest behavior, ensuring the overall reliability and trustworthiness of the collaborative model training process. Furthermore, healthcare and medical data collaboration services in real-world scenarios have been discussed and developed. To further validate the performance of FL-HMChain, a Convolutional Neural Network-based Federated Learning (FL-CNN-HMChain) model is investigated for medical image identification. This model achieves better performance than the baseline Convolutional Neural Network (CNN), with an average improvement of 4.7% on Area Under Curve (AUC) and 7% on Accuracy (ACC), respectively. Furthermore, the probability of privacy leakage can be effectively reduced by the blockchain-based parameter transfer mechanism in federated learning between local and global models.
Funding: supported in part by the National Natural Science Foundation of China under Grant 62072351; in part by the Academy of Finland under Grant 308087, Grant 335262, Grant 345072, and Grant 350464; in part by the Open Project of Zhejiang Lab under Grant 2021PD0AB01; and in part by the 111 Project under Grant B16037.
Abstract: Digital Twin (DT) supports real-time analysis and provides a reliable simulation platform in the Internet of Things (IoT). The creation and application of DT hinges on large amounts of data, which puts pressure on the application of Artificial Intelligence (AI) for DT descriptions and intelligent decision-making. Federated Learning (FL) is a cutting-edge technology that enables geographically dispersed devices to collaboratively train a shared global model locally rather than relying on a data center to perform model training. Therefore, DT can benefit from being combined with FL, which addresses the "data island" problem in traditional AI. However, FL still faces serious challenges, such as enduring single-point failures, suffering from poisoning attacks, and lacking effective incentive mechanisms. Before the successful deployment of DT, we should tackle the issues caused by FL. Researchers from industry and academia have recognized the potential of introducing Blockchain Technology (BT) into FL to overcome the challenges faced by FL, where BT, acting as a distributed and immutable ledger, can store data in a secure, traceable, and trusted manner. However, to the best of our knowledge, a comprehensive literature review on this topic is still missing. In this paper, we review existing works about blockchain-enabled FL and visualize their prospects with DT. To this end, we first propose evaluation requirements with respect to security, fault tolerance, fairness, efficiency, cost-saving, profitability, and support for heterogeneity. Then, we classify existing literature according to the functionalities of BT in FL and analyze their advantages and disadvantages based on the proposed evaluation requirements. Finally, we discuss open problems in the existing literature and the future of DT supported by blockchain-enabled FL, based on which we further propose some directions for future research.