Abstract: Training deep neural networks (DNNs) requires a significant amount of time and resources to obtain acceptable results, which severely limits their deployment on resource-limited platforms. This paper proposes DarkFPGA, a novel customizable framework to efficiently accelerate the entire DNN training process on a single FPGA platform. First, we explore batch-level parallelism to enable efficient FPGA-based DNN training. Second, we devise a novel hardware architecture optimised by a batch-oriented data pattern and tiling techniques to effectively exploit parallelism. Moreover, an analytical model is developed to determine the optimal design parameters for the DarkFPGA accelerator with respect to a specific network specification and FPGA resource constraints. Our results show that, when using 8-bit integers to train VGG-like networks on the CIFAR dataset on the Maxeler MAX5 platform, the accelerator trains about 10 times faster than a CPU and consumes about a third of the energy of GPU training.
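To make the idea of 8-bit integer arithmetic applied across a whole batch more concrete, here is a minimal Python sketch. It is not DarkFPGA's actual quantization scheme or hardware dataflow, which the abstract does not describe. It assumes symmetric per-tensor quantization, and the names `quantize_int8` and `int8_batch_matmul` are hypothetical. The point is only that one set of quantized weights is reused for every sample in the batch, with products accumulated in 32-bit integers.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization (an assumed scheme, not the paper's)."""
    # Guard against an all-zero tensor so the scale is never zero.
    scale = max(float(np.max(np.abs(x))), 1e-12) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_batch_matmul(activations, weights):
    """Multiply a whole batch by one weight matrix using int8 operands."""
    qa, sa = quantize_int8(activations)
    qw, sw = quantize_int8(weights)
    # Accumulate in int32 so sums of int8 products cannot overflow here.
    acc = qa.astype(np.int32) @ qw.astype(np.int32)
    # Rescale the integer result back to floating point.
    return acc.astype(np.float64) * (sa * sw)

rng = np.random.default_rng(0)
batch = rng.standard_normal((128, 64))    # 128 samples, 64 features each
weights = rng.standard_normal((64, 32))   # shared by every sample in the batch

approx = int8_batch_matmul(batch, weights)
reference = batch @ weights
rel_err = np.linalg.norm(approx - reference) / np.linalg.norm(reference)
print(f"relative error of the int8 product: {rel_err:.4f}")
```

Because every row of `batch` meets the same `weights`, the per-sample products are independent of one another, which is the property that batch-level parallelism relies on.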
Funding: Project supported by the National Natural Science Foundation of China (No. 11001061) and the Science and Technology Foundation of Guizhou Province of China (No. [2008]2123).
Abstract: Based on domain decomposition, a parallel two-level finite element method for the stationary Navier-Stokes equations is proposed and analyzed. The basic idea of the method is first to solve the Navier-Stokes equations on a coarse grid and then to solve the resulting residual equations in parallel on a fine grid. The method has low communication complexity and is easy to implement. Using local a priori error estimates for finite element discretizations, error bounds for the approximate solution are derived. Numerical results are also given to illustrate the high efficiency of the method.
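The coarse-then-fine idea can be illustrated on a much simpler model problem. The sketch below is an assumption-laden toy, not the paper's method: it replaces the Navier-Stokes equations with the 1D nonlinear equation -nu u'' + u u' = f on (0, 1) with zero boundary values, uses central finite differences instead of finite elements, and omits the parallel domain-decomposition step entirely. All function names are hypothetical. It solves the nonlinear problem on a coarse grid with Newton's method, interpolates to a fine grid, and then solves a single linearized residual equation there.

```python
import numpy as np

NU = 1.0  # diffusion coefficient of the toy problem (assumed value)

def exact(x):
    """Manufactured solution used to build the forcing term."""
    return np.sin(np.pi * x)

def forcing(x):
    """f = -NU*u'' + u*u' for u = sin(pi*x)."""
    s, c = np.sin(np.pi * x), np.cos(np.pi * x)
    return NU * np.pi**2 * s + np.pi * s * c

def residual_and_jacobian(u, n):
    """Discrete residual F(u) and its Jacobian on a grid with n cells."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]
    padded = np.concatenate(([0.0], u, [0.0]))  # add boundary values
    left, mid, right = padded[:-2], padded[1:-1], padded[2:]
    F = NU * (2 * mid - left - right) / h**2 + mid * (right - left) / (2 * h) - forcing(x)
    J = np.zeros((n - 1, n - 1))
    i = np.arange(n - 1)
    J[i, i] = 2 * NU / h**2 + (right - left) / (2 * h)
    J[i[1:], i[:-1]] = -NU / h**2 - mid[1:] / (2 * h)
    J[i[:-1], i[1:]] = -NU / h**2 + mid[:-1] / (2 * h)
    return F, J

def newton_solve(n, tol=1e-10, max_iter=50):
    """Full nonlinear solve, intended for the coarse grid only."""
    u = np.zeros(n - 1)
    for _ in range(max_iter):
        F, J = residual_and_jacobian(u, n)
        if np.linalg.norm(F, np.inf) < tol:
            break
        u = u - np.linalg.solve(J, F)
    return u

def two_level(n_coarse, n_fine):
    """Coarse nonlinear solve followed by one linear fine-grid correction."""
    u_coarse = newton_solve(n_coarse)
    xc = np.linspace(0.0, 1.0, n_coarse + 1)
    xf = np.linspace(0.0, 1.0, n_fine + 1)
    u_start = np.interp(xf, xc, np.concatenate(([0.0], u_coarse, [0.0])))[1:-1]
    # Residual equation linearized about the interpolated coarse solution.
    F, J = residual_and_jacobian(u_start, n_fine)
    u_fine = u_start - np.linalg.solve(J, F)
    return xf[1:-1], u_start, u_fine

x, u_start, u_fine = two_level(8, 64)
err_start = np.max(np.abs(u_start - exact(x)))
err_fine = np.max(np.abs(u_fine - exact(x)))
print(f"interpolated coarse error: {err_start:.2e}, after fine correction: {err_fine:.2e}")
```

The only nonlinear iteration happens on the small coarse system. The fine grid sees a single linear solve, which is the part a domain-decomposition variant would split into independent local subproblems.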
Abstract: Building on a recently proposed model for calculating constant electromagnetic field values, the present article explores the electromagnetic field configuration generated by parallel electrical wires. This requires a reevaluation of the drawing procedure for constructing field curves of constant field value around multiple parallel conducting wires. To achieve this, we employ methods akin to those used for creating contours on topographical maps, ensuring a consistent numerical field value along the entire length of each field curve. Subsequent calculations will be conducted for scenarios where the wires are not parallel.
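The notion of a curve along which the field value stays constant can be checked numerically. The sketch below does not implement the article's new model, which the abstract does not specify. As a stand-in, it assumes the textbook magnetic field of infinitely long straight wires, B = mu0 I / (2 pi r) per wire, superposed as vectors. The function name `field_magnitude` and the wire positions are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A

def field_magnitude(x, y, wires):
    """Magnitude of the superposed field at (x, y).

    wires: iterable of (wire_x, wire_y, current), each wire running along z.
    Uses the standard infinite-straight-wire formula (an assumption here).
    """
    bx = by = 0.0
    for wx, wy, current in wires:
        dx, dy = x - wx, y - wy
        r2 = dx * dx + dy * dy
        scale = MU0 * current / (2 * math.pi * r2)
        # The field circulates around the wire: direction (-dy, dx) / r.
        bx -= scale * dy
        by += scale * dx
    return math.hypot(bx, by)

# One wire: every point on a circle around it has the same field value,
# so the circle is a curve of constant field magnitude.
single = [(0.0, 0.0, 1.0)]
radius = 0.1
on_circle = [
    field_magnitude(radius * math.cos(t), radius * math.sin(t), single)
    for t in (2 * math.pi * k / 16 for k in range(16))
]

# Two wires with equal currents: the fields cancel at the midpoint.
pair = [(-0.05, 0.0, 1.0), (0.05, 0.0, 1.0)]
midpoint_value = field_magnitude(0.0, 0.0, pair)
```

Evaluating `field_magnitude` on a regular grid and passing the result to any contouring routine would give the map-like level curves the abstract alludes to. With several wires those curves are in general no longer circles.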
Funding: Supported by the Office of Naval Research (Grant ONRDC14292111).
Abstract: This paper is concerned with the three-dimensional numerical simulation of a plunging liquid jet. The transient processes of forming an air cavity around the jet, capturing an initially large air bubble, and the break-up of this large toroidal bubble into smaller bubbles were analyzed. A stabilized finite element method (FEM) was employed in parallel numerical simulations on an adaptive, unstructured grid and coupled with a level-set method to track the interface between air and liquid. These simulations show that the inertia of the liquid jet initially depresses the pool's surface, forming an annular air cavity which surrounds the liquid jet. A toroidal liquid eddy which is subsequently formed in the liquid pool results in air cavity collapse, and in turn entrains air into the liquid pool from the unstable annular air gap region around the liquid jet.
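For readers unfamiliar with level-set interface tracking, the following one-dimensional Python sketch shows the core mechanism. It is only an illustration under strong assumptions and bears no relation to the paper's 3D stabilized FEM solver: the velocity is a constant, the scheme is first-order upwind finite differences, and the names `advect_level_set` and `interface_position` are hypothetical. The interface is the point where the level-set function phi changes sign, and it moves because phi is advected with the flow.

```python
import numpy as np

def advect_level_set(phi, velocity, dx, dt, steps):
    """Advance phi_t + velocity * phi_x = 0 with first-order upwinding.

    Assumes velocity > 0 and velocity * dt / dx <= 1. The inflow value
    phi[0] is held fixed, which is acceptable only while the interface
    stays well away from the left boundary.
    """
    phi = phi.copy()
    cfl = velocity * dt / dx
    for _ in range(steps):
        # The right-hand side is evaluated from old values before the update.
        phi[1:] -= cfl * (phi[1:] - phi[:-1])
    return phi

def interface_position(x, phi):
    """Locate the zero crossing of phi by linear interpolation."""
    k = int(np.argmax(phi > 0))  # first node on the positive side
    return x[k - 1] - phi[k - 1] * (x[k] - x[k - 1]) / (phi[k] - phi[k - 1])

x = np.linspace(0.0, 1.0, 101)
dx = x[1] - x[0]
phi0 = x - 0.3                   # signed distance to an interface at x = 0.3
velocity, dt, steps = 1.0, 0.005, 40

phi = advect_level_set(phi0, velocity, dx, dt, steps)
start = interface_position(x, phi0)
end = interface_position(x, phi)
print(f"interface moved from {start:.3f} to {end:.3f}")
```

Negative phi marks one fluid and positive phi the other, so the interface never has to be stored explicitly. That is what lets the method cope with topological changes such as a bubble breaking up.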