Abstract: The deferred correction (DeC) is an iterative procedure, characterized by increasing accuracy at each iteration, which can be used to design numerical methods for systems of ODEs. The main advantage of this framework is the automatic way of obtaining arbitrarily high order methods, which can be put in Runge-Kutta (RK) form. The drawback is the larger computational cost with respect to the most used RK methods. To reduce this cost, in an explicit setting, we propose an efficient modification: we introduce interpolation processes between the DeC iterations, decreasing the computational cost associated with the low order ones. We provide the Butcher tableaux of the new modified methods and we study their stability, showing that in some cases the computational advantage does not affect the stability. The flexibility of the novel modification allows nontrivial applications to PDEs and the construction of adaptive methods. The good performance of the introduced methods is broadly tested on several benchmarks in both ODE and PDE contexts.
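The iteration described in this abstract can be sketched as follows. This is a minimal sketch of a standard explicit DeC step on equispaced subnodes, written in the simplified form y^{m,(k)} = y^0 + dt * sum_r theta[m, r] * f(y^{r,(k-1)}); it does not include the interpolation-based modification proposed in the paper. The function names `dec_weights` and `dec_step`, and the choice of M = order - 1 subintervals with K = order iterations, are assumptions made for illustration.

```python
import numpy as np

def dec_weights(M):
    """theta[m, r] = integral over [0, t_m] of the r-th Lagrange basis
    polynomial on M + 1 equispaced nodes t_0, ..., t_M in [0, 1]."""
    nodes = np.linspace(0.0, 1.0, M + 1)
    theta = np.zeros((M + 1, M + 1))
    for r in range(M + 1):
        e = np.zeros(M + 1)
        e[r] = 1.0
        # Degree-M fit through M + 1 points is exact interpolation.
        coeffs = np.polyfit(nodes, e, M)
        antiderivative = np.polyint(coeffs)  # integration constant is 0
        theta[:, r] = np.polyval(antiderivative, nodes)
    return theta

def dec_step(f, t, y, dt, order):
    """One explicit DeC step of the requested order for y' = f(t, y).
    Illustrative choice: M = order - 1 subintervals, K = order iterations."""
    M = max(order - 1, 1)
    nodes = np.linspace(0.0, 1.0, M + 1)
    theta = dec_weights(M)
    y = np.asarray(y, dtype=float)
    Y = np.tile(y, (M + 1, 1))  # initial guess: constant in time
    for _ in range(order):
        F = np.array([f(t + nodes[r] * dt, Y[r]) for r in range(M + 1)])
        Y = y + dt * (theta @ F)  # each sweep raises the accuracy by one order
    return Y[-1]
```

Every sweep evaluates f at all subnodes, which is the source of the cost the abstract refers to: with these choices an order-p step uses on the order of p^2 right-hand-side evaluations.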
Funding: Support via the NSF grants NSF-19-04774, NSF-AST-2009776, NASA-2020-1241 and the NASA grant 80NSSC22K0628.
Abstract: GPU computing is expected to play an integral part in all modern Exascale supercomputers. It is also expected that higher order Godunov schemes will make up a significant fraction of the application mix on such supercomputers. It is, therefore, very important to prepare the community of users of higher order schemes for hyperbolic PDEs for this emerging opportunity. Not every algorithm that is used in the space-time update of the solution of hyperbolic PDEs will take well to GPUs. However, we identify a small core of algorithms that take exceptionally well to GPU computing. Based on an analysis of available options, we have been able to identify weighted essentially non-oscillatory (WENO) algorithms for spatial reconstruction, along with arbitrary derivative (ADER) algorithms for time extension, followed by a corrector step, as the winning three-part algorithmic combination. Even when a winning subset of algorithms has been identified, it is not clear that they will port seamlessly to GPUs. The low data throughput between CPU and GPU, as well as the very small cache sizes on modern GPUs, implies that we have to think through all aspects of the task of porting an application to GPUs. For that reason, this paper identifies the techniques and tricks needed for making a successful port of this very useful class of higher order algorithms to GPUs. Application codes face a further challenge: the GPU results need to be practically indistinguishable from the CPU results, so that the legacy knowledge bases embedded in these application codes are preserved during the port to GPUs. This requirement often makes a complete code rewrite impossible. For that reason, it is safest to use an approach based on OpenACC directives, so that most of the code remains intact (as long as it was originally well-written). This paper is intended to be a one-stop shop for anyone seeking to make an OpenACC-based port of a higher order Godunov scheme to GPUs. We focus on three broad and high-impact areas where higher order Godunov schemes are used. The first area is computational fluid dynamics (CFD). The second is computational magnetohydrodynamics (MHD), which has an involution constraint that has to be mimetically preserved. The third is computational electrodynamics (CED), which has involution constraints and also extremely stiff source terms. Together, these three diverse uses of higher order Godunov methodology cover many of the most important application areas. In all three cases, we show that the optimal use of algorithms, techniques, and tricks, along with the use of OpenACC, yields superlative speedups on GPUs. As a bonus, we find a most remarkable and desirable result: some higher order schemes, with their larger operation count per zone, show better speedup than lower order schemes on GPUs. In other words, the GPU is an optimal stratagem for overcoming the higher computational complexity of higher order schemes. Several avenues for future improvement have also been identified. A scalability study is presented for a real-world application using GPUs and comparable numbers of high-end multicore CPUs. It is found that GPUs offer a substantial performance benefit over a comparable number of CPUs, especially when all the methods designed in this paper are used.
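To make the spatial reconstruction step named in this abstract concrete, the sketch below implements the classical fifth-order WENO reconstruction of Jiang and Shu for the left-biased interface value at x_{i+1/2} on a periodic 1-D grid. It is an illustrative NumPy version, not the OpenACC code discussed in the paper, and the WENO variant used there may differ. The function name `weno5_left` and the value of `eps` are assumptions.

```python
import numpy as np

def weno5_left(u, eps=1e-6):
    """Left-biased WENO5 value at x_{i+1/2} from periodic cell averages u."""
    um2, um1 = np.roll(u, 2), np.roll(u, 1)    # u[i-2], u[i-1]
    up1, up2 = np.roll(u, -1), np.roll(u, -2)  # u[i+1], u[i+2]

    # Third-order candidate reconstructions on the three substencils.
    p0 = (2.0 * um2 - 7.0 * um1 + 11.0 * u) / 6.0
    p1 = (-um1 + 5.0 * u + 2.0 * up1) / 6.0
    p2 = (2.0 * u + 5.0 * up1 - up2) / 6.0

    # Smoothness indicators: large where a substencil crosses a jump.
    b0 = 13.0 / 12.0 * (um2 - 2.0 * um1 + u) ** 2 + 0.25 * (um2 - 4.0 * um1 + 3.0 * u) ** 2
    b1 = 13.0 / 12.0 * (um1 - 2.0 * u + up1) ** 2 + 0.25 * (um1 - up1) ** 2
    b2 = 13.0 / 12.0 * (u - 2.0 * up1 + up2) ** 2 + 0.25 * (3.0 * u - 4.0 * up1 + up2) ** 2

    # Nonlinear weights built from the linear weights (1/10, 6/10, 3/10).
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    return (a0 * p0 + a1 * p1 + a2 * p2) / (a0 + a1 + a2)
```

Each interface value depends only on a fixed five-cell stencil and involves no data-dependent branching, so the computation is independent from zone to zone and carries a fairly high arithmetic count per zone.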
Funding: Funded by the National Key Research and Development Program of China (No. 2021YFB2600704), the National Natural Science Foundation of China (No. 52171272), and the Significant Science and Technology Project of the Ministry of Water Resources of China (No. SKS-2022112).
Abstract: This paper presents an efficient numerical technique for solving multi-term linear systems of fractional ordinary differential equations (FODEs), which have been widely used in modeling various phenomena in engineering and science. An approximate solution of the system is sought in the form of a finite series over the Müntz polynomials. By using the collocation procedure in the time interval, one obtains a linear algebraic system for the coefficients of the expansion, which can be easily solved numerically by a standard procedure. This technique also serves as the basis for solving time-fractional partial differential equations (PDEs). Modified radial basis functions are used for the spatial approximation of the solution. Collocation in the solution domain transforms the equation into a system of fractional ordinary differential equations similar to the one mentioned above. Several examples verify the high accuracy and efficiency of the proposed technique.
Funding: Supported by the National Natural Science Foundation of China (10571053, 10871066, 10811120282) and the Programme for New Century Excellent Talents in University (NCET-06-0712).
Abstract: In this paper, we propose an accelerated search-extension method (ASEM) based on the interpolated coefficient finite element method, the search-extension method (SEM) and the two-grid method to obtain multiple solutions of semilinear elliptic equations. This strategy is not only successfully implemented to obtain multiple solutions for a class of semilinear elliptic boundary value problems, but also greatly reduces the expensive computation. The numerical results in 1-D and 2-D cases show the efficiency of our approach.
Funding: This work was supported by the Collaborative Innovation Center of Taiyuan Heavy Machinery Equipment, the Postdoctoral Startup Fund of Taiyuan University of Science and Technology (20152034), the Natural Science Foundation of Shanxi Province (201701D221135), and the National College Students Innovation and Entrepreneurship Project (201710109003 and 201610109007).
Abstract: In this paper, the three-variable shifted Jacobi operational matrix of fractional derivatives is used together with the collocation method for the numerical solution of three-dimensional multi-term fractional-order PDEs with variable coefficients. The main characteristic of this approach is that it reduces such problems to solving a system of algebraic equations, which greatly simplifies the problem. The approximate solutions of nonlinear fractional PDEs with variable coefficients thus obtained with three-variable shifted Jacobi polynomials are compared with the exact solutions. Furthermore, some theorems and lemmas are introduced to verify the convergence results of our algorithm. Lastly, several numerical examples are presented to test the superiority and efficiency of the proposed method.
Funding: This research was supported by the National Natural Science Foundation of China (10571053), the Scientific Research Fund of Hunan Provincial Education Department (0513039), and the Special Funds of State Major Basic Research Projects (G1999032804).
Abstract: This article combines the finite element method, the interpolated coefficient finite element method, the eigenfunction expansion method, and the search-extension method to obtain multiple solutions of semilinear elliptic equations. This strategy not only greatly reduces the expensive computation, but is also successfully implemented to obtain multiple solutions for a class of semilinear elliptic boundary value problems with non-odd nonlinearity on some convex or nonconvex domains. Numerical solutions, illustrated by graphics for visualization, show the efficiency of the approach.
Funding: Supported by the NSFC and the 973 key project of the MOST.
Abstract: In this paper the authors consider the summability of formal solutions for some first order singular PDEs with irregular singularity. They prove that in this case the formal solutions are divergent but, except for a countable set of directions, Borel summable.
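As a concrete illustration of a divergent formal solution that is Borel summable away from exceptional directions, consider the classical Euler-type ODE below. This is a standard textbook example chosen here for illustration only; it is not taken from the paper, whose setting is first order singular PDEs.

```latex
% Formal power series solution of  x^2 y' = y - x :
\[
  \hat{y}(x) = \sum_{n \ge 0} n!\, x^{n+1}
  \qquad \text{(divergent for every } x \neq 0\text{)}.
\]
% Formal Borel transform, x^{n+1} \mapsto \xi^{n}/n! :
\[
  \mathcal{B}\hat{y}(\xi) = \sum_{n \ge 0} \xi^{n} = \frac{1}{1-\xi},
  \qquad \text{singular only at } \xi = 1 .
\]
% Laplace transform along the ray of argument \theta :
\[
  y_{\theta}(x) = \int_{0}^{\infty e^{i\theta}} \frac{e^{-\xi/x}}{1-\xi}\, d\xi ,
  \qquad \theta \not\equiv 0 \pmod{2\pi} .
\]
```

Here the only singular direction is the positive real axis, on which the pole at ξ = 1 lies. In every other direction the Laplace integral converges for x in a suitable sector and has the divergent series as its asymptotic expansion.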
Abstract: Flocking refers to the collective behavior of a large number of interacting entities, where the interactions between discrete individuals produce collective motion on the large scale. We employ an agent-based model to describe the microscopic dynamics of each individual in a flock, and use a fractional partial differential equation (fPDE) to model the evolution of macroscopic quantities of interest. The macroscopic models with phenomenological interaction functions are derived by applying the continuum hypothesis to the microscopic model. Instead of specifying the fPDEs with an ad hoc fractional order for nonlocal flocking dynamics, we learn the effective nonlocal influence function in the fPDEs directly from particle trajectories generated by the agent-based simulations. We demonstrate how the learning framework is used to connect the discrete agent-based model to the continuum fPDEs in one- and two-dimensional nonlocal flocking dynamics. In particular, a Cucker-Smale particle model is employed to describe the microscale dynamics of each individual, while Euler equations with nonlocal interaction terms are used to compute the evolution of macroscale quantities. The trajectories generated by the particle simulations mimic the field data of tracking logs that can be obtained experimentally. They can be used to learn the fractional order of the influence function using a Gaussian process regression model implemented with Bayesian optimization. We show in one- and two-dimensional benchmarks that the numerical solution of the learned Euler equations, solved by a finite volume scheme, yields correct density distributions consistent with the collective behavior of the agent-based system solved by the particle method. The proposed method offers new insights into how to scale discrete agent-based models to continuum-based PDE models, and could serve as a paradigm for extracting effective governing equations for nonlocal flocking dynamics directly from particle trajectories.
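The microscale model named in this abstract can be sketched in a few lines. The code below is a minimal forward-Euler step of the classical Cucker-Smale system, dx_i/dt = v_i and dv_i/dt = (1/N) * sum_j phi(|x_j - x_i|) (v_j - v_i), with the commonly used influence function phi(r) = K / (1 + r^2)^beta. The kernel, its parameters `K` and `beta`, the time integrator, and the function name `cucker_smale_step` are assumptions for illustration; the paper's influence function and solver may differ.

```python
import numpy as np

def cucker_smale_step(x, v, dt, K=1.0, beta=0.5):
    """One forward-Euler step for N agents; x and v have shape (N, d)."""
    diff = x[None, :, :] - x[:, None, :]       # diff[i, j] = x_j - x_i
    r2 = np.sum(diff ** 2, axis=-1)            # squared pairwise distances
    phi = K / (1.0 + r2) ** beta               # symmetric influence weights
    dv = v[None, :, :] - v[:, None, :]         # dv[i, j] = v_j - v_i
    accel = (phi[:, :, None] * dv).mean(axis=1)  # (1/N) * sum over j
    return x + dt * v, v + dt * accel
```

Because the weights are symmetric in i and j, the pairwise velocity exchanges cancel in the sum over all agents, so the mean velocity is preserved while the velocity spread decays as the agents align. Trajectories produced by repeated calls play the role of the tracking data from which the influence function is learned.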
Funding: Open Access funding provided by ETH Zurich. DSB acknowledges support via NSF grants NSF-19-04774, NSF-AST-2009776 and NASA-2020-1241.
Abstract: This paper examines a class of involution-constrained PDEs where some part of the PDE system evolves a vector field whose curl remains zero or grows in proportion to specified source terms. Such PDEs are referred to as curl-free or curl-preserving, respectively. They arise very frequently in equations for hyperelasticity and compressible multiphase flow, in certain formulations of general relativity and in the numerical solution of Schrödinger’s equation. Experience has shown that if nothing special is done to account for the curl-preserving vector field, it can blow up in a finite amount of simulation time. In this paper, we catalogue a class of DG-like schemes for such PDEs. To retain the globally curl-free or curl-preserving constraints, the components of the vector field, as well as their higher moments, must be collocated at the edges of the mesh. They are updated using potentials collocated at the vertices of the mesh. The resulting schemes: (i) do not blow up even after very long integration times, (ii) do not need any special cleaning treatment, (iii) can operate with large explicit timesteps, (iv) do not require the solution of an elliptic system and (v) can be extended to higher orders using DG-like methods. The methods rely on a special curl-preserving reconstruction and on multidimensional upwinding. The Galerkin projection, which is crucial to the design of a DG method, is now conducted at the edges of the mesh and yields a weak form update that uses potentials obtained at the vertices of the mesh with the help of a multidimensional Riemann solver. A von Neumann stability analysis of the curl-preserving methods is conducted and the limiting CFL numbers of this entire family of methods are catalogued in this work. The stability analysis confirms that with increasing order of accuracy, our novel curl-free methods have superlative phase accuracy while substantially reducing dissipation. We also show that PNPM-like methods, which only evolve the lower moments while reconstructing the higher moments, retain much of the excellent wave propagation characteristics of the DG-like methods while offering a much larger CFL number and lower computational complexity. The quadratic energy preservation of these methods is also shown to be excellent, especially at higher orders. The methods are also shown to be curl-preserving over long integration times.