Abstract: Let M_n denote the maximum of a sample from the logarithmic general error distribution with parameter v ≥ 1. Higher-order expansions for the distributions of the powered extremes M_n^p are derived under an optimal choice of normalizing constants. It is shown that M_n^p converges to the Fréchet extreme value distribution at the rate of 1/n when v = 1, and to the Gumbel extreme value distribution at the rate of (log log n)^2/(log n)^(1-1/v) when v > 1.
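For reference (these are the standard textbook definitions, not taken from the paper itself), the two limit laws named in the abstract are the Fréchet and Gumbel extreme value distributions:

```latex
\Phi_{\alpha}(x)=\exp\{-x^{-\alpha}\},\quad x>0,
\qquad
\Lambda(x)=\exp\{-e^{-x}\},\quad x\in\mathbb{R}.
```

The rate statements then quantify how fast P{(M_n^p - b_n)/a_n ≤ x} approaches the limit law, for the paper's specific choice of the index α and the norming constants a_n, b_n.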
Abstract: Company bankruptcies cost banks billions of dollars in losses each year. Credit risk prediction is therefore a critical part of a bank's loan approval decision process. Traditional financial models for credit risk prediction are no longer adequate for describing today's complex relationship between the financial health and potential bankruptcy of a company. In this work, a multiple classifier system (embedded in a multiple intelligent agent system) is proposed to predict the financial health of a company. In our model, each individual agent (classifier) makes a prediction on the likelihood of credit risk based on only partial information about the company. Each agent is an expert, but has limited knowledge (represented by features) about the company. The decisions of all agents are combined to form a final credit risk prediction. Experiments show that our model outperforms other existing methods on the benchmark Compustat American Corporations dataset.
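A minimal sketch of the combination idea described above (not the paper's actual system): each "agent" is a one-feature threshold classifier that sees only part of a company's record, and the final prediction is a majority vote. The feature names and thresholds here are illustrative assumptions.

```python
# Hypothetical agents: each flags risk (1) from a single financial ratio.
def make_agent(feature, threshold):
    """An agent predicts risk (1) when its one feature falls below a threshold."""
    def agent(company):
        return 1 if company[feature] < threshold else 0
    return agent

def combined_prediction(agents, company):
    """Majority vote over all agents' individual risk predictions."""
    votes = [agent(company) for agent in agents]
    return 1 if sum(votes) > len(votes) / 2 else 0

agents = [
    make_agent("current_ratio", 1.0),   # liquidity expert
    make_agent("profit_margin", 0.0),   # profitability expert
    make_agent("equity_ratio", 0.2),    # leverage expert
]

healthy = {"current_ratio": 2.1, "profit_margin": 0.08, "equity_ratio": 0.45}
risky = {"current_ratio": 0.6, "profit_margin": -0.03, "equity_ratio": 0.10}

print(combined_prediction(agents, healthy))  # 0 (low risk)
print(combined_prediction(agents, risky))    # 1 (high risk)
```

The paper's combination rule may differ (e.g., weighted voting); majority vote is shown only because it is the simplest decision-fusion scheme.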
Funding: Partially supported by the National Science Foundation of China and Hong Kong RGC Joint Research Scheme (NSFC/RGC 11961160718) and the fund of the Guangdong Provincial Key Laboratory of Computational Science and Material Design (No. 2019B030301001); also supported by the National Science Foundation of China (NSFC-11871264) and the Shenzhen Natural Science Fund (RCJC20210609103819018).
Abstract: In this work, we study gradient-based regularization methods for neural networks. We mainly focus on two regularization methods: total variation and Tikhonov regularization. Adding the regularization term to the training loss is equivalent to using neural networks to solve variational problems, mostly in high dimensions in practical applications. We introduce a general framework to analyze the error between neural network solutions and true solutions of variational problems. The error consists of three parts: the approximation error of neural networks, the quadrature error of numerical integration, and the optimization error. We also apply the proposed framework to two-layer networks to derive a priori error estimates when the true solution belongs to the so-called Barron space. Moreover, we conduct numerical experiments showing that neural networks can solve the corresponding variational problems sufficiently well. Networks with gradient-based regularization are much more robust in image applications.
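To make the two regularizers concrete, here is a deliberately simple discrete sketch for a function sampled on a 1-D grid (an illustration only; the paper works with neural network gradients in higher dimensions): total variation penalizes the integral of |u'|, while Tikhonov regularization penalizes the integral of |u'|^2.

```python
# Discrete 1-D versions of the two gradient-based regularizers.
def total_variation(u, h):
    """Discrete total variation: sum of |u_{i+1} - u_i|."""
    return sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))

def tikhonov(u, h):
    """Discrete Tikhonov (gradient) penalty: h * sum ((u_{i+1}-u_i)/h)^2."""
    return sum(((u[i + 1] - u[i]) / h) ** 2 for i in range(len(u) - 1)) * h

n = 100
h = 1.0 / n
smooth = [i * h for i in range(n + 1)]            # u(x) = x
step = [0.0] * (n // 2) + [1.0] * (n // 2 + 1)    # jump at x = 0.5

# Both profiles rise from 0 to 1, so their total variation is about equal...
print(total_variation(smooth, h), total_variation(step, h))  # both ~ 1.0
# ...but the Tikhonov penalty blows up on the jump, favoring the smooth profile.
print(tikhonov(smooth, h) < tikhonov(step, h))  # True
```

This contrast is the usual motivation for the two choices: total variation tolerates sharp edges (useful in imaging), while the Tikhonov term enforces smoothness.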
Abstract: This paper explores the generalization error of fuzzy neural networks, analyzes the reasons for its occurrence, and derives an equation for calculating the error via the confidence interval approach. In addition, a generalization error transferring (GET) method for improving the generalization error is proposed. Simulation results for a heating furnace show that the GET scheme is efficient.
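As a generic illustration of the confidence-interval approach mentioned above (not the paper's specific equation), one can bound the mean generalization error from per-sample test errors with a normal-approximation interval, mean ± z·s/√n. The error values below are made up.

```python
import math
import statistics

def error_confidence_interval(errors, z=1.96):
    """Approximate 95% CI (z = 1.96) for the mean error: mean +/- z * s / sqrt(n)."""
    n = len(errors)
    mean = statistics.fmean(errors)
    half = z * statistics.stdev(errors) / math.sqrt(n)
    return mean - half, mean + half

# Hypothetical per-sample test errors of a trained network.
test_errors = [0.12, 0.08, 0.15, 0.10, 0.09, 0.11, 0.14, 0.07]
lo, hi = error_confidence_interval(test_errors)
print(round(lo, 4), round(hi, 4))  # interval around the mean error 0.1075
```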
Funding: Supported in part by the Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20060512001.
Abstract: Support vector machines were originally designed for binary classification; how to extend them effectively to multi-class classification is still an ongoing research issue. In this paper, we consider kernel machines that are natural extensions of the multi-category support vector machines originally proposed by Crammer and Singer. Based on algorithmic stability, we obtain generalization error bounds for the proposed kernel machines.
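The Crammer-Singer formulation referenced above replaces the binary hinge loss with a multi-class one: for a score vector f = (f_1, ..., f_K) and true class y, the loss is max(0, 1 + max_{k ≠ y} f_k - f_y). A minimal sketch of just this loss:

```python
def crammer_singer_loss(scores, y):
    """Multi-class hinge loss of Crammer and Singer:
    max(0, 1 + max_{k != y} f_k - f_y)."""
    rival = max(s for k, s in enumerate(scores) if k != y)
    return max(0.0, 1.0 + rival - scores[y])

# Correct class 0 wins by a margin >= 1, so the loss vanishes:
print(crammer_singer_loss([2.0, -1.0, 0.5], 0))  # 0.0
# Correct class 0 wins, but by a margin < 1, so a positive loss remains:
print(crammer_singer_loss([0.2, 0.0, 0.1], 0))   # 0.9
```

The kernel machines in the paper minimize a regularized empirical sum of this loss over a reproducing kernel Hilbert space; the stability-based bounds apply to that minimizer.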
Funding: Supported by the National Natural Science Foundation of China (11171275), the Natural Science Foundation Project of CQ (cstc2012jjA00029), and the Doctoral Grant of the University of Shanghai for Science and Technology (BSQD201608).
Abstract: The logarithmic general error distribution is an extension of the lognormal distribution. In this paper, the higher-order expansion of the distribution of the partial maximum of the logarithmic general error distribution is derived under optimal norming constants.
Funding: Supported by the Department of Energy under Grant No. DE-SC0017867, the CAMERA program (L.L., J.Z., L.Z.-N.), and the Hong Kong Research Grant Council under Grant No. 16303817 (Y.Y.). We thank the Berkeley Research Computing (BRC) program at the University of California, Berkeley, and the Google Cloud Platform (GCP) for the computational resources. We thank Weinan E, Chao Ma, and Lei Wu for pointing out the critical role of the path norm in understanding the numerical behavior of the generalization error, and Joan Bruna, Jiequn Han, Joonho Lee, Jianfeng Lu, Tengyu Ma, and Lexing Ying for valuable discussions.
Abstract: The task of using machine learning to approximate the mapping x → Σ_{i=1}^d x_i^2 with x_i ∈ [-1,1] seems to be a trivial one. Given knowledge of the separable structure of the function, one can design a sparse network to represent the function very accurately, or even exactly. When such structural information is not available and one may only use a dense neural network, the optimization procedure to find the sparse network embedded in the dense network is akin to finding a needle in a haystack, using a given number of samples of the function. We demonstrate that the cost (measured by sample complexity) of finding the needle is directly related to the Barron norm of the function. While only a small number of samples is needed to train a sparse network, a dense network trained with the same number of samples exhibits a large test loss and a large generalization gap. To control the size of the generalization gap, we find that explicit regularization becomes increasingly important as d increases. The numerically observed sample complexity with explicit regularization scales as O(d^2.5), which is in fact better than the theoretically predicted sample complexity that scales as O(d^4). Without explicit regularization (i.e., relying on implicit regularization alone), the numerically observed sample complexity is significantly higher and is close to O(d^4.5).
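The separable structure mentioned above can be made concrete with a deliberately trivial sketch: the target decomposes into d independent one-dimensional maps, so a "sparse" model needs only one squaring unit per coordinate, whereas a dense network must discover this decomposition from samples (the hard part the abstract is about).

```python
import random

def target(x):
    """The mapping x -> sum_i x_i^2 studied above."""
    return sum(t * t for t in x)

def sparse_model(x):
    """Exact separable representation: one 1-D square unit per coordinate.
    (A sparse network can realize this structure; a dense one must find it.)"""
    unit = lambda t: t * t
    return sum(unit(t) for t in x)

d = 10
x = [random.uniform(-1.0, 1.0) for _ in range(d)]
print(abs(sparse_model(x) - target(x)))  # 0.0: the representation is exact
```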
Abstract: The problem of estimating the directions of arrival (DOA) and Doppler frequencies of multiple sources is considered in the presence of general array errors (such as amplitude and phase errors of the sensors and sensor position errors). Adopting the direct array manifold of a uniform circular array (UCA), the Doppler frequencies can be estimated via the DOA matrix. Based on an analysis of the statistical characteristics of the general array errors, the DOA estimates can be obtained by weighted total least squares. Numerical results illustrate that the estimator is robust to general array errors and demonstrate its capabilities.
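For intuition about the weighting idea (the paper uses weighted *total* least squares on the array data; this sketch shows plain weighted least squares on a made-up line fit): measurements believed to be less reliable get smaller weights, so gross errors barely perturb the estimate.

```python
def weighted_least_squares(xs, ys, ws):
    """Closed-form WLS fit of y = a*x + b minimizing sum_i w_i (y_i - a x_i - b)^2."""
    W = sum(ws)
    Sx = sum(w * x for w, x in zip(ws, xs))
    Sy = sum(w * y for w, y in zip(ws, ys))
    Sxx = sum(w * x * x for w, x in zip(ws, xs))
    Sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    a = (W * Sxy - Sx * Sy) / (W * Sxx - Sx * Sx)
    b = (Sy - a * Sx) / W
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 2.0, 4.1, 30.0]         # last measurement is a gross outlier...
ws = [1.0, 1.0, 1.0, 0.001]        # ...so it is strongly down-weighted
a, b = weighted_least_squares(xs, ys, ws)
print(round(a, 2))  # close to the underlying slope 2
```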
Abstract: Let {X_n : n ≥ 1} be a sequence of independent random variables with common general error distribution GED(v) with shape parameter v > 0, and let M_{n,r} denote the r-th largest order statistic of X_1, X_2, ..., X_n. With different normalizing constants, the distributional expansions and the uniform convergence rates of the normalized powered order statistics |M_{n,r}|^p are established. An alternative method is presented to estimate the probability of the r-th extremes. Numerical analyses are provided to support the main results.
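A Monte Carlo sketch of the object studied above (not from the paper): for shape parameter v = 2 the general error distribution reduces to a normal distribution, so the distribution of the powered r-th largest order statistic |M_{n,r}|^p can be estimated by simulation. Sample sizes and the threshold are illustrative.

```python
import random

random.seed(0)

def rth_largest_power(n, r, p):
    """Draw n standard normals (GED with v = 2, up to scale) and
    return |r-th largest order statistic|^p."""
    sample = sorted(random.gauss(0.0, 1.0) for _ in range(n))
    return abs(sample[-r]) ** p

trials = 2000
n, r, p, x = 100, 2, 2.0, 4.0
hits = sum(rth_largest_power(n, r, p) <= x for _ in range(trials))
print(hits / trials)  # empirical estimate of P(|M_{n,2}|^2 <= 4)
```

The paper's expansions approximate exactly this kind of probability analytically, with explicit convergence rates in n.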
Funding: Supported by a gift to Princeton University from iFlytek and by the Office of Naval Research (ONR) (Grant No. N00014-13-1-0338).
Abstract: A fairly comprehensive analysis is presented for the gradient descent dynamics of training two-layer neural network models in the situation where the parameters in both layers are updated. General initialization schemes as well as general regimes for the network width and training data size are considered. In the overparametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels. In addition, it is proved that throughout the training process the functions represented by the neural network model are uniformly close to those of a kernel method. For general values of the network width and training data size, sharp estimates of the generalization error are established for target functions in the appropriate reproducing kernel Hilbert space.
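A toy sketch of the setting analyzed above: gradient descent on a two-layer ReLU network f(x) = Σ_k a_k·relu(w_k·x) with both layers updated, fit to a handful of 1-D points. The width, step size, and data are illustrative; this only shows the training loss decreasing, not the kernel-regime results of the paper.

```python
import random

random.seed(1)
relu = lambda z: max(0.0, z)

m = 20                                   # network width
a = [random.gauss(0, 1) / m for _ in range(m)]   # outer layer
w = [random.gauss(0, 1) for _ in range(m)]       # inner layer
data = [(-1.0, 1.0), (-0.5, 0.25), (0.5, 0.25), (1.0, 1.0)]  # y = x^2

def predict(x):
    return sum(a[k] * relu(w[k] * x) for k in range(m))

def loss():
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

lr = 0.05
initial = loss()
for _ in range(500):
    ga, gw = [0.0] * m, [0.0] * m
    for x, y in data:
        err = 2.0 * (predict(x) - y) / len(data)
        for k in range(m):
            pre = w[k] * x
            ga[k] += err * relu(pre)                       # d loss / d a_k
            gw[k] += err * a[k] * (x if pre > 0 else 0.0)  # d loss / d w_k
    for k in range(m):                    # update BOTH layers
        a[k] -= lr * ga[k]
        w[k] -= lr * gw[k]

print(initial, loss())  # training loss decreases
```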
Funding: Supported by the National Natural Science Foundation of China (Grant No. 10771053), the Specialized Research Foundation for the Doctoral Program of Higher Education of China (SRFDP) (Grant No. 20060512001), and the Natural Science Foundation of Hubei Province (Grant No. 2007ABA139).
Abstract: Semi-supervised learning has been of growing interest over the past few years, and many methods have been proposed. Although various algorithms are available to implement semi-supervised learning, there are still gaps in our understanding of how the generalization error depends on the numbers of labeled and unlabeled data. In this paper, we consider a graph-based semi-supervised classification algorithm and establish its generalization error bounds. Our results show the close relation between the generalization performance and the structural invariants of the data graph.
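A sketch of the graph-based idea in general (a generic label-propagation scheme, not necessarily the specific algorithm analyzed above): labeled nodes keep their labels, and each unlabeled node repeatedly averages its neighbors' scores until the scores settle.

```python
# Path graph 0-1-2-3-4; node 0 labeled +1, node 4 labeled -1.
edges = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = {0: 1.0, 4: -1.0}

scores = {v: labels.get(v, 0.0) for v in edges}
for _ in range(200):                       # iterate to (near) convergence
    new = {}
    for v in edges:
        if v in labels:
            new[v] = labels[v]             # clamp labeled nodes
        else:
            nbrs = edges[v]
            new[v] = sum(scores[u] for u in nbrs) / len(nbrs)
    scores = new

print([round(scores[v], 2) for v in range(5)])  # interpolates between the labels
```

On this path graph the fixed point is the harmonic interpolation [1, 0.5, 0, -0.5, -1]; thresholding the scores at 0 then classifies the unlabeled nodes. Generalization bounds like those in the paper relate the quality of such solutions to invariants of the graph.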
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11771435, 22073110 and 12171466).
Abstract: With the remarkable empirical success of neural networks across diverse scientific disciplines, rigorous error and convergence analyses are also being developed and enriched. However, there has been little theoretical work on neural networks for solving interface problems. In this paper, we perform a convergence analysis of physics-informed neural networks (PINNs) for solving second-order elliptic interface problems. Specifically, we consider PINNs with domain decomposition technologies and introduce gradient-enhanced strategies on the interfaces to deal with the boundary and interface jump conditions. It is shown that the neural network sequence obtained by minimizing a Lipschitz-regularized loss function converges to the unique solution of the interface problem in H^2 as the number of samples increases. Numerical experiments are provided to support our theoretical analysis.
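A hedged 1-D sketch of the loss terms such a PINN would minimize (not the paper's setup, which is second-order elliptic in higher dimensions with gradient-enhanced interface terms): for -(βu')' = 0 on (0,1) with β = 1 on (0,1/2), β = 2 on (1/2,1), u(0) = 0, u(1) = 1, and jump conditions [u] = 0 and [βu'] = 0 at x = 1/2, we assemble PDE, boundary, and interface-jump residuals via finite differences and evaluate them at the known exact solution.

```python
BETA1, BETA2, IFACE = 1.0, 2.0, 0.5

def u_exact(x):
    """Piecewise-linear exact solution with continuous flux:
    slopes 4/3 and 2/3 so that 1 * (4/3) = 2 * (2/3)."""
    return (4.0 / 3.0) * x if x <= IFACE else 1.0 + (2.0 / 3.0) * (x - 1.0)

def d1(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)        # central first derivative

def d2(f, x, h=1e-4):
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)  # central second derivative

def interface_loss(u):
    # PDE residual u'' = 0 sampled inside each subdomain:
    pde = sum(d2(u, x) ** 2 for x in (0.1, 0.25, 0.4, 0.6, 0.75, 0.9))
    # Boundary conditions u(0) = 0, u(1) = 1:
    bc = u(0.0) ** 2 + (u(1.0) - 1.0) ** 2
    # Interface jump conditions [u] = 0 and [beta u'] = 0 at x = 1/2:
    jump_u = (u(IFACE + 1e-6) - u(IFACE - 1e-6)) ** 2
    jump_flux = (BETA2 * d1(u, IFACE + 1e-3) - BETA1 * d1(u, IFACE - 1e-3)) ** 2
    return pde + bc + jump_u + jump_flux

print(interface_loss(u_exact))  # essentially zero at the exact solution
```

A PINN replaces `u_exact` by one network per subdomain and minimizes this kind of loss over the network parameters; the paper's analysis shows that the resulting minimizers converge to the exact solution.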
Abstract: Closed production systems, such as plant factories and vertical farms, have emerged to ensure a sustainable supply of fresh food and to cope with the increasing consumption of natural resources by a growing population. In a plant factory, the microclimate model is one of the direct control components of the whole system. In order to better realize dynamic regulation of the microclimate, energy savings, and consumption reduction, it is necessary to optimize the environmental parameters in the plant factory and thereby determine the influencing factors of the atmosphere control system. Therefore, this study aims to identify accurate microclimate models and to predict temperature change from experimental data using the classification and regression trees (CART) algorithm. Random forest theory was used to represent the temperature control system, and a mechanism model of the temperature control system was proposed to improve the performance of plant factories. In terms of energy efficiency, the main factors influencing temperature change in the plant factories were identified, including the temperature and air volume flow of the temperature control device, as well as the internal relative humidity. The generalization error of the prediction model reached 0.0907. The results demonstrate that the proposed model captures the quantitative relationship and provides a prediction function. This study can serve as a reference for the design of high-precision environmental control systems in plant factories.
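The core step of the CART regression algorithm used above is the split search: choose the feature threshold minimizing the summed squared error of the two child means. A minimal single-feature sketch with made-up data (feature: control-device temperature; target: observed room temperature change):

```python
def sse(ys):
    """Sum of squared errors of a leaf about its mean prediction."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    """Return the threshold t minimizing SSE(y | x <= t) + SSE(y | x > t)."""
    best = (float("inf"), None)
    for t in sorted(set(xs))[:-1]:        # exclude max so the right child is nonempty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        best = min(best, (sse(left) + sse(right), t))
    return best[1]

temps = [14, 15, 16, 22, 23, 24]         # control-device temperature (made up)
dT = [-1.9, -2.1, -2.0, 1.1, 0.9, 1.0]   # observed temperature change (made up)
print(best_split(temps, dT))  # 16: separates the cooling and heating regimes
```

A full CART tree applies this search recursively to each child, and a random forest averages many such trees grown on bootstrap samples with random feature subsets.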