Funding: Funded by the Shenzhen Basic Research (Key Project) (No. JCYJ20200109113405927); the Shenzhen Stable Supporting Program (General Project) (No. GXWD20201230155427003-20200821160539001); the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies (2022B1212010005); the Peng Cheng Laboratory Project (Grant No. PCL2021A02); and the Ministry of Education's Collaborative Education Project with Industry Cooperation (No. 22077141140831).
Abstract: Centralized training of deep learning models poses privacy risks that hinder their deployment. Federated learning (FL) has emerged as a solution to these risks, allowing multiple clients to train deep learning models collaboratively without sharing raw data. However, FL is vulnerable to heterogeneously distributed data, which weakens convergence stability and leads to suboptimal performance of the trained model on local data. This is due to the discarding of the old local model at each round of training, which results in the loss of personalized information that is critical for maintaining model accuracy and ensuring robustness. In this paper, we propose FedTC, a personalized federated learning method with two classifiers that retains personalized information in the local model and improves the model's performance on local data. FedTC divides the model into two parts, the extractor and the classifier, where the classifier is the last layer of the model and the extractor consists of the other layers. The classifier in the local model is always retained so that the personalized information is not lost. After the global model is received, the local extractor is overwritten by the global model's extractor, and the classifier of the global model serves as an additional classifier of the local model to guide local training. FedTC introduces a two-classifier training strategy to coordinate the two classifiers for local model updates. Experimental results on the CIFAR-10 and CIFAR-100 datasets demonstrate that FedTC performs better on heterogeneous data than existing approaches such as FedAvg, FedPer, and local training, achieving a maximum improvement of 27.95% in classification test accuracy compared to FedAvg.
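The model-update step described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it assumes a model is a plain dictionary of named parameter lists, and the names `split_model`, `receive_global`, and the key `"classifier"` are hypothetical. The two-classifier training strategy itself is not specified in the abstract and is not covered here.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: layer name -> flat list of parameters.
Params = Dict[str, List[float]]


def split_model(model: Params, classifier_key: str = "classifier") -> Tuple[Params, List[float]]:
    """Split a model into its extractor (all other layers) and its classifier (last layer)."""
    extractor = {name: weights for name, weights in model.items() if name != classifier_key}
    return extractor, model[classifier_key]


def receive_global(local: Params, global_model: Params,
                   classifier_key: str = "classifier") -> Tuple[Params, List[float]]:
    """Overwrite the local extractor with the global one and keep the local classifier.

    Returns the updated local model and a copy of the global classifier,
    which would serve as the additional classifier guiding local training.
    """
    global_extractor, global_classifier = split_model(global_model, classifier_key)
    updated = {name: list(weights) for name, weights in global_extractor.items()}
    # The personalized local classifier is retained, never overwritten.
    updated[classifier_key] = list(local[classifier_key])
    return updated, list(global_classifier)


local_model = {"conv": [1.0], "classifier": [0.5]}
global_model = {"conv": [2.0], "classifier": [9.0]}
updated_model, auxiliary_classifier = receive_global(local_model, global_model)
```

Here `updated_model` carries the global extractor weights alongside the untouched local classifier, while `auxiliary_classifier` holds the global classifier separately.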
Abstract: While recommendation plays an increasingly critical role in our living, study, work, and entertainment, the recommendations we receive are often for irrelevant, duplicate, or uninteresting products and services. A critical reason for such bad recommendations lies in the intrinsic assumption, made in existing theories and systems, that recommended users and items are independent and identically distributed (IID). Another phenomenon is that, while tremendous efforts have been made to model specific aspects of users or items, the overall user and item characteristics and their non-IIDness have been overlooked. In this paper, the non-IID nature and characteristics of recommendation are discussed, followed by a non-IID theoretical framework, in order to build a deep and comprehensive understanding of the intrinsic nature of recommendation problems from the perspective of both couplings and heterogeneity. This non-IID recommendation research triggers a paradigm shift from IID to non-IID recommendation research and can hopefully deliver informed, relevant, personalized, and actionable recommendations. It creates exciting new directions and fundamental solutions to address various complexities, including cold-start, sparse data-based, cross-domain, group-based, and shilling attack-related issues.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61103010, 61103190, and 60803100); the National Basic Research Program (973) of China (No. 2012CB933500); and the High-Tech R&D Program (863) of China (No. 2012AA011001).
Abstract: The error patterns of a wireless channel can be represented by a binary sequence of ones (burst) and zeros (run), which is referred to as a trace. Recent surveys have shown that the run length distribution of a wireless channel is intrinsically heavy-tailed. Analytical models that characterize such features have to deal with the trade-off between complexity and accuracy. In this paper, we use an independent but not identically distributed (inid) stochastic process to characterize such channel behavior and show how to parameterize the inid bit error model on the basis of a trace. The proposed model has only two parameters, both of which have intuitive meanings and can be easily determined from a trace. Compared with chaotic maps, the inid bit error model is simple for practical use but can still be derived from a heavy-tailed distribution in theory. Simulation results demonstrate that the inid model can match the trace with fewer parameters. We then propose an improvement on the inid model to capture the 'bursty' nature of channel errors, described by the burst length distribution. Our theoretical analysis is supported by an experimental evaluation.
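To make the trace terminology concrete, the run lengths and burst lengths of a trace can be extracted as sketched below. This is only an illustration of the definitions in the abstract (ones form bursts, zeros form runs); the function name `segment_lengths` is hypothetical, and the sketch does not implement the inid model or its parameter estimation.

```python
from itertools import groupby
from typing import Dict, List


def segment_lengths(trace: List[int]) -> Dict[str, List[int]]:
    """Return the lengths of consecutive zero segments (runs) and one segments (bursts)."""
    lengths: Dict[str, List[int]] = {"run": [], "burst": []}
    # groupby yields maximal blocks of identical consecutive bits.
    for bit, block in groupby(trace):
        kind = "burst" if bit == 1 else "run"
        lengths[kind].append(sum(1 for _ in block))
    return lengths


example_trace = [0, 0, 1, 1, 1, 0, 1]
print(segment_lengths(example_trace))  # {'run': [2, 1], 'burst': [3, 1]}
```

The empirical run length and burst length distributions mentioned in the abstract would be histograms of these two lists.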
Funding: Supported by the National Key Research and Development Program of China (No. 2019YFC1520904) and the National Natural Science Foundation of China (No. 61973250).
Abstract: The influence of non-independent and identically distributed (non-IID) data on Federated Learning (FL) has been a serious concern. Clustered Federated Learning (CFL) is an emerging approach for reducing the impact of non-IID data, which clusters clients according to a similarity calculated by relevant metrics. Unfortunately, existing CFL methods pursue only accuracy improvement and ignore the convergence rate. Additionally, the designed client selection strategy affects the clustering results. Finally, traditional semi-supervised learning changes the distribution of data on clients, resulting in higher local costs and undesirable performance. In this paper, we propose a novel CFL method named ASCFL, which selects clients to participate in training and can dynamically adjust the balance between accuracy and convergence speed on datasets consisting of labeled and unlabeled data. To deal with unlabeled data, the label prediction strategy predicts labels with encoders. The client selection strategy improves accuracy and reduces overhead by selecting clients with higher losses to participate in the current round. Moreover, the similarity-based clustering strategy uses a new indicator to measure the similarity between clients. Experimental results show that ASCFL has certain advantages in model accuracy and convergence speed over three state-of-the-art methods on two popular datasets.
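The loss-based client selection mentioned in this abstract can be illustrated with a simple top-k rule. This is a sketch under stated assumptions, not the ASCFL algorithm: the abstract only says that clients with higher losses are selected, so the fixed budget `k`, the tie-breaking by client name, and the function name `select_clients` are all hypothetical.

```python
from typing import Dict, List


def select_clients(losses: Dict[str, float], k: int) -> List[str]:
    """Pick the k clients with the highest reported loss for the current round."""
    # Sort by descending loss; ties are broken by client name for determinism.
    ranked = sorted(losses, key=lambda client: (-losses[client], client))
    return ranked[:k]


reported_losses = {"client_a": 0.2, "client_b": 0.9, "client_c": 0.5}
print(select_clients(reported_losses, k=2))  # ['client_b', 'client_c']
```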
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11231005).
Abstract: We investigate three kinds of strong laws of large numbers for capacities with a new notion of independently and identically distributed (IID) random variables for sub-linear expectations initiated by Peng. It turns out that these theorems are natural and fairly neat extensions of the classical Kolmogorov strong law of large numbers to the case where probability measures are no longer additive. An important feature of these strong laws of large numbers is that they provide a frequentist perspective on capacities.
Funding: Supported by the National Natural Science Foundation of China (Nos. 11501325, 11231005).
Abstract: This paper deals with strong laws of large numbers for sublinear expectation under a controlled first moment condition. For a sequence of independent random variables, the author obtains a strong law of large numbers under the condition that there is a control random variable whose first moment for sublinear expectation is finite. By discussing the relation between sublinear expectation and Choquet expectation, the author illustrates that, for a sequence of i.i.d. random variables, the finiteness of the uniform first moment for sublinear expectation alone cannot ensure the validity of the strong law of large numbers, which in turn shows that the result is meaningful.
Funding: Supported by the National Natural Science Foundation of China (Nos. 11671132, 11601147); the Hunan Provincial Natural Science Foundation of China (No. 16J3010); the Philosophy and Social Science Foundation of Hunan Province (No. 16YBA053); and the Key Scientific Research Project of Hunan Provincial Education Department (No. 15A032).
Abstract: A batch Markov arrival process (BMAP) X^*=(N, J) is a 2-dimensional Markov process with two components: the counting process N and the phase process J. It is proved that the phase process is a time-homogeneous Markov chain with a finite state space, or, for short, a Markov chain. In this paper, a new inverse problem is first proposed: given a Markov chain J, can we construct a process N such that the 2-dimensional process X^*=(N, J) is a BMAP? The process X^*=(N, J) is then said to be an adjoining BMAP for the Markov chain J. For a given Markov chain, adjoining processes exist and are not unique. Two kinds of adjoining BMAPs are constructed: BMAPs with fixed constant batches, and BMAPs with independent and identically distributed (i.i.d.) random batches. The method used in this paper is not the usual matrix-analytic method of studying BMAPs but a path-analytic method: we directly construct sample paths of adjoining BMAPs. The expressions of the characteristics (D_k, k = 0, 1, 2, ...) and the transition probabilities of the adjoining BMAP are obtained from the density matrix Q of the given Markov chain J. Moreover, we obtain two frontal theorems. These expressions are presented here for the first time.
Funding: Supported by the National Natural Science Foundation of China (11061007).
Abstract: In this paper, an exponential inequality for weighted sums of identically distributed NOD (negatively orthant dependent) random variables is established, by which we obtain an almost sure convergence rate that reaches the one available for independent random variables in terms of a Bernstein-type inequality. As an application, we obtain the corresponding exponential inequality for the Priestley-Chao estimator of nonparametric regression under NOD samples, from which the strong consistency rate is also obtained.
Abstract: In radar target detection, an optimum processor needs to automatically adapt its weights to changes in the environment. Conventionally, the optimum weights are obtained from a substantial number of independently and identically distributed (i.i.d.) interference samples, which is not always realistic in the inhomogeneous clutter background of airborne radar. The lack of i.i.d. samples inevitably leads to performance deterioration for optimum processing. In this paper, a novel parametric adaptive processing method is proposed for airborne radar target detection based on the modified Doppler distributed clutter (DDC) model, which accounts for the contribution of the clutter's internal motion. It differs from conventional methods in that the adaptive weights are determined by two parameters of the DDC model, i.e., the angular center and the spread. A low-complexity nonlinear operator approach is also proposed to estimate these parameters. Simulation and performance analysis show that the proposed method can remarkably reduce the dependence on i.i.d. samples and is computationally efficient for practical use.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11001077).
Abstract: Let $(X_n)_{n\ge 1}$ be a sequence of independent identically distributed (i.i.d.) positive random variables with $EX_1 = \mu$ and $\mathrm{Var}(X_1) = \sigma^2$. In the present paper, we establish the moderate deviations principle for the products of partial sums $\left(\prod_{k=1}^{n} S_k / (n!\,\mu^n)\right)^{1/(\gamma b_n \sqrt{2n})}$, where $\gamma = \sigma/\mu$ denotes the coefficient of variation and $(b_n)$ is the moderate deviations scale.
Funding: The authors thank the National Natural Science Foundation of China (Grant No. 12061028), the Guangxi Natural Science Foundation Joint Incubation Project (Grant No. 2018GXNSFAA294131), the Guangxi Natural Science Foundation (Grant No. 2018GXNSFAA281011), and the Innovation Project of Guangxi Graduate Education (Grant No. YCSW2020175) for their financial support.
Abstract: In this paper, some laws of large numbers are established for random variables that follow the Pareto distribution, so that the relevant conclusions in the traditional probability space are extended to the sub-linear expectation space. Based on the Pareto distribution, we obtain the weak law of large numbers and the strong law of large numbers for weighted sums of some sequences of independent random variables.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61104105, U0735003, and 60974047) and the Natural Science Foundation of Guangdong Province of China (No. 9451009001002702).
Abstract: For a sampled-data control system with nonuniform sampling, the sampling interval sequence, which is continuously distributed in a given interval, is described as a multiple independent and identically distributed (i.i.d.) process. With this process, the closed-loop system is transformed into an asynchronous dynamical impulsive model with input delays. Sufficient conditions for closed-loop mean-square exponential stability are presented in terms of linear matrix inequalities (LMIs), in which the relation between the nonuniform sampling and the mean-square exponential stability of the closed-loop system is explicitly established. Based on the stability conditions, a controller design method is given, which is further formulated as a convex optimization problem with LMI constraints. Numerical examples and experimental results are given to show the effectiveness and advantages of the theoretical results.
Abstract: In this paper, we consider a class of nonlinear regression problems without the assumption that the data are independent and identically distributed. We propose a corresponding mini-max problem for nonlinear regression and give a numerical algorithm. Such an algorithm can be applied in regression and machine learning problems, and yields better results than traditional least squares and machine learning methods.