Funding: Supported by the National Key R&D Program of China (Grant No. 2019YFA0709503), the Shanghai Sailing Program, the Natural Science Foundation of Shanghai (Grant No. 20ZR1429000), the National Natural Science Foundation of China (Grant No. 62002221), the Shanghai Municipal Science and Technology Project (Grant No. 20JC1419500), and the HPC of the School of Mathematical Sciences at Shanghai Jiao Tong University.
Abstract: Why heavily parameterized neural networks (NNs) do not overfit the data is an important long-standing open question. We propose a phenomenological model of NN training to explain this non-overfitting puzzle. Our linear frequency principle (LFP) model accounts for a key dynamical feature of NNs: they learn low frequencies first, irrespective of microscopic details. Theory based on our LFP model shows that low-frequency dominance of the target function is the key condition for the non-overfitting of NNs, which is verified by experiments. Furthermore, through an idealized two-layer NN, we unravel how the detailed microscopic NN training dynamics statistically gives rise to an LFP model with quantitative prediction power.
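The low-frequency-first behavior described above is easy to observe directly. Below is a minimal sketch (not code from the paper; the network width, the two-frequency target, and all training settings are illustrative assumptions) that fits a small tanh network to a 1D target and tracks the Fourier amplitude of the residual in a low and a high frequency band.

```python
# Minimal sketch of the frequency principle: fit a small network to a
# two-frequency 1D target and watch the residual spectrum band by band.
# Architecture, target, and hyperparameters are illustrative choices only.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# 1D target with one low- and one high-frequency component
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(np.pi * x) + 0.5 * torch.sin(10 * np.pi * x)

model = nn.Sequential(nn.Linear(1, 200), nn.Tanh(), nn.Linear(200, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

freqs = np.fft.rfftfreq(256, d=2 / 256)  # frequencies of the sample grid
for step in range(5001):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        # Fourier amplitude of the residual: low modes should shrink first
        err = (model(x) - y).detach().squeeze().numpy()
        spec = np.abs(np.fft.rfft(err))
        low = spec[freqs <= 2].sum()   # band containing the sin(pi x) mode
        high = spec[freqs > 2].sum()   # band containing the sin(10 pi x) mode
        print(f"step {step}: low-freq residual {low:.3f}, high-freq residual {high:.3f}")
```

In runs of this kind the low-frequency residual typically collapses within the first few hundred steps, while the high-frequency residual decays much later, which is the dynamical feature the LFP model is built around.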
Abstract: This note provides a correction for a missing weight constant in the MscaleDNN formula, together with some comments on the performance of the corrected algorithm.
Funding: Sponsored by the National Key R&D Program of China Grant Nos. 2019YFA0709503 (Z.X.) and 2020YFA0712000 (Z.M.), the Shanghai Sailing Program (Z.X.), the Natural Science Foundation of Shanghai Grant No. 20ZR1429000 (Z.X.), the National Natural Science Foundation of China Grant Nos. 62002221 (Z.X.), 12101401 (T.L.), 12101402 (Y.Z.), and 12031013 (Z.M.), the Shanghai Municipal Science and Technology Project Grant No. 20JC1419500 (Y.Z.), the Lingang Laboratory Grant No. LG-QS-202202-08 (Y.Z.), the Shanghai Municipal Science and Technology Major Project No. 2021SHZDZX0102, the HPC of the School of Mathematical Sciences, and the Student Innovation Center at Shanghai Jiao Tong University.
Abstract: In this paper, we propose a machine learning approach via a model-operator-data network (MOD-Net) for solving PDEs. A MOD-Net is driven by a model to solve PDEs based on an operator representation, with regularization from data. For linear PDEs, we use a DNN to parameterize the Green's function and obtain a neural operator that approximates the solution according to Green's method. To train the DNN, the empirical risk consists of the mean squared loss with the least-squares formulation or the variational formulation of the governing equation and boundary conditions. For complicated problems, the empirical risk also includes a few labels, which are computed on coarse grid points at low computational cost and significantly improve the model accuracy. Intuitively, the labeled dataset works as a regularization in addition to the model constraints. A MOD-Net solves a family of PDEs rather than a specific one, and it is much more efficient than the original neural operator because only a few expensive labels are required. We numerically show that MOD-Net is very efficient in solving the Poisson equation and the one-dimensional radiative transfer equation. For nonlinear PDEs, a nonlinear MOD-Net can similarly be used as an ansatz, exemplified by solving several nonlinear PDE problems such as the Burgers equation.
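The Green's-function construction for linear PDEs can be sketched concretely. The following is a minimal illustration (not the authors' implementation; the 1D Poisson setup, the random source family, and all hyperparameters are assumptions for exposition) in which a DNN G_theta(x, x') is trained so that the quadrature u_theta(x) = (1/N) * sum_j G_theta(x, x_j) f(x_j) satisfies -u'' = f with zero boundary values in the least-squares sense, across randomly sampled sources f.

```python
# Sketch of the MOD-Net idea for -u''(x) = f(x) on [0, 1], u(0) = u(1) = 0:
# a DNN parameterizes the Green's function, the solution is recovered by
# quadrature, and training minimizes the least-squares PDE residual plus
# the boundary condition over a family of source terms f.
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
N = 64                                       # quadrature points for the Green's integral
xq = torch.linspace(0, 1, N).unsqueeze(1)    # quadrature grid for x'

G = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                  nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(G.parameters(), lr=1e-3)

def u_theta(x, f_vals):
    # u(x) ~ mean_j G(x, x_j) f(x_j); pair every x with every quadrature node
    pairs = torch.cat([x.repeat_interleave(N, 0), xq.repeat(x.shape[0], 1)], dim=1)
    g = G(pairs).view(x.shape[0], N)
    return (g * f_vals.squeeze(1)).mean(dim=1, keepdim=True)

for step in range(2001):
    # sample a source from an assumed family f(x) = a * sin(pi k x)
    a, k = torch.rand(1) * 2, float(torch.randint(1, 4, (1,)))
    f = lambda x: a * torch.sin(math.pi * k * x)
    xc = torch.rand(32, 1, requires_grad=True)     # interior collocation points
    u = u_theta(xc, f(xq))
    du = torch.autograd.grad(u.sum(), xc, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), xc, create_graph=True)[0]
    residual = ((-d2u - f(xc)) ** 2).mean()        # least-squares PDE residual
    xb = torch.tensor([[0.0], [1.0]])
    boundary = (u_theta(xb, f(xq)) ** 2).mean()    # enforce u = 0 on the boundary
    loss = residual + boundary
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        print(f"step {step}: loss {loss.item():.5f}")
```

Because the source f is resampled at every step, the learned G_theta serves a whole family of Poisson problems rather than one instance. The few coarse-grid labels the abstract mentions would enter as an extra mean-squared term added to `loss`.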