Abstract: Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve application performance. With existing Document Object Model (DOM) based parsing, performance degrades because of sequential processing and large memory requirements, so a more efficient XML parser is needed to mitigate these issues. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm divides the XML file into n parts and produces n trees in parallel. The profiling phase of the RXPF framework produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different cores using multiple file sizes. The regression phase produces the prediction model, from which the code generation phase produces the final code for efficient parsing of XML files. The RXPF framework shows a performance improvement ranging from 9.54% to 32.34% over other existing models used for parsing XML files.
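The divide-and-parse idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' PXTG implementation: it assumes a flat document whose root holds non-nested, non-self-closing record elements, and the names `split_records`, `parse_chunk`, `parallel_trees`, and the synthetic `<part>` root are hypothetical. Threads are used only to keep the example self-contained; in CPython a process pool would be needed for a real speedup.

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def split_records(xml_text, record_tag, n):
    """Cut a flat XML document into at most n chunks of whole records.

    Assumption: records are non-nested <record_tag>...</record_tag> elements.
    """
    pattern = re.compile(rf"<{record_tag}\b.*?</{record_tag}>", re.DOTALL)
    records = pattern.findall(xml_text)
    size = max(1, -(-len(records) // n))  # ceiling division
    return ["".join(records[i:i + size]) for i in range(0, len(records), size)]


def parse_chunk(chunk):
    """Parse one chunk under a synthetic <part> root so it is well-formed."""
    return ET.fromstring(f"<part>{chunk}</part>")


def parallel_trees(xml_text, record_tag, n):
    """Return one tree root per chunk, with the chunks parsed concurrently."""
    chunks = split_records(xml_text, record_tag, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(parse_chunk, chunks))


# Hypothetical input: a catalog of ten book records.
doc = "<catalog>" + "".join(
    f"<book id='{i}'><title>T{i}</title></book>" for i in range(10)
) + "</catalog>"
trees = parallel_trees(doc, "book", 4)
print([len(t) for t in trees])  # number of records in each tree
```

Splitting at record boundaries keeps every chunk well-formed, which is what lets each part be parsed independently of the others.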
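Abstract: In English, parallelism has both a rhetorical and a grammatical function. In contrastive studies of English-Chinese translation, scholars have paid most attention to how it differs in construction from the Chinese rhetorical device paibi and to its rhetorical effect, and comparatively little to its grammatical function. Practice shows that rendering English and Chinese parallel structures equivalently not only helps preserve the linguistic features and functions of the source text, but also helps target-text readers grasp the structure of the original and follow its line of thought. The rhetorical and grammatical functions of parallelism are equally important, and both deserve due attention from translators. Guided by functional equivalence theory, and taking the translation of parallel sentences in The Road Less Traveled, a work of psychology by the American author M. Scott Peck, as an example, this paper explores how literal translation, conversion, division, reversal of order, amplification, and omission can bring the responses of source-text and target-text readers closer together and thereby achieve functional equivalence.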
Funding: Supported by the Natural Science Foundation of China (No. 41374122).
Abstract: The complexity of an elastic wavefield increases the nonlinearity of inversion. To some extent, multiscale inversion decreases this nonlinearity and prevents the inversion from falling into local extremes. A multiscale strategy that combines the simultaneous use of frequency groups with a layer-stripping method based on the damped wavefield improves the stability of inversion. A dual-level parallel algorithm is then used to decrease the computational cost and improve practicability. The seismic wave modeling of a single frequency and the inversion within a frequency group are computed in parallel on multiple nodes, based on a multifrontal massively parallel sparse direct solver and MPI. Numerical tests using an overthrust model show that the proposed inversion algorithm can effectively improve the stability and accuracy of inversion by selecting an appropriate inversion frequency and damping factor in low-frequency seismic data.
Abstract: In this paper, the first boundary problem for a quasilinear parabolic system of second order is studied by the finite difference method with intrinsic parallelism. For this problem, the stability of the difference schemes with intrinsic parallelism is justified in the sense of the continuous dependence of the discrete vector solution of the difference schemes on the discrete data of the original problem, without assuming the existence of smooth solutions of the original problem.
Abstract: Training deep neural networks (DNNs) requires a significant amount of time and resources to obtain acceptable results, which severely limits its deployment on resource-limited platforms. This paper proposes DarkFPGA, a novel customizable framework to efficiently accelerate the entire DNN training on a single FPGA platform. First, we explore batch-level parallelism to enable efficient FPGA-based DNN training. Second, we devise a novel hardware architecture optimised by a batch-oriented data pattern and tiling techniques to effectively exploit parallelism. Moreover, an analytical model is developed to determine the optimal design parameters for the DarkFPGA accelerator with respect to a specific network specification and FPGA resource constraints. Our results show that, using 8-bit integers to train VGG-like networks on the CIFAR dataset on the Maxeler MAX5 platform, the accelerator runs about 10 times faster than CPU training and consumes about a third of the energy of GPU training.
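Batch-level parallelism, as named in this abstract, means that the samples of a batch share one set of weights and can be processed independently. The Python sketch below is only a software illustration of that idea with 8-bit saturating arithmetic. It does not reproduce the DarkFPGA architecture, and the function names, the `shift` rescaling, and the `lanes` parameter are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def clamp_int8(v):
    """Saturate an integer to the signed 8-bit range."""
    return max(-128, min(127, v))


def layer_forward(weights, sample, shift=4):
    """Integer matrix-vector product, rescaled by a right shift and saturated."""
    out = []
    for row in weights:
        acc = sum(w * x for w, x in zip(row, sample))  # wide accumulator
        out.append(clamp_int8(acc >> shift))
    return out


def batch_forward(weights, batch, lanes=4):
    """Apply the same weights to every sample, one worker lane per sample."""
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        return list(pool.map(lambda s: layer_forward(weights, s), batch))


# Hypothetical 3x2 weight matrix and a batch of four 2-element samples.
weights = [[16, 0], [0, 16], [8, 8]]
batch = [[10, 20], [-30, 40], [127, 127], [0, -5]]
print(batch_forward(weights, batch))
```

Because the weights are read-only during the forward pass, the lanes never write to shared state, which is the property that makes the batch dimension easy to parallelize.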