Dynamic latency over the Internet is an important parameter for evaluating the performance of Web service orchestration. In this paper, we propose a performance analysis and correctness checking method for service orchestration with dynamic latency simulated in Colored Petri Nets (CPNs). First, we extend the CPN to the Web Service Composition Orchestration Network System (WS-CONS) to describe dynamic latency in service orchestration. Second, with simulated dynamic latency, a buffer-limited policy and an admittance-control policy are designed in WS-CONS and implemented in CPN Tools. Under the buffer-limited policy, a passing message is discarded if the node capacity is inadequate. Under the admittance-control policy, whether a message may enter the system depends on the number of messages concurrently flowing in the system, which helps to raise the success rate of message passing. Finally, the system performance is evaluated by running the models in CPN Tools. Simulation results show that dynamic latency plays an important role in system throughput and response latency. This simulation helps system designers make appropriate trade-offs quickly and at low cost.
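As a rough illustration of the two policies described in the abstract above, the following Python sketch models a single node in a tick-based loop. It is not the authors' CPN model: the function name `simulate`, the one-arrival-per-tick workload, the uniform latency of 1 to 3 ticks, and the single-node topology are all assumptions made for illustration.

```python
import random
from collections import deque


def simulate(num_messages, node_capacity, max_in_flight=None, seed=0):
    """Toy single-node model (hypothetical, not the WS-CONS model).

    Buffer-limited policy: a message that finds the node buffer full is dropped.
    Admittance control (optional): a message is held at the entrance while
    `max_in_flight` messages are already inside the system.
    Returns (delivered, dropped, ticks).
    """
    if max_in_flight is not None and max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    rng = random.Random(seed)
    entrance = deque()  # messages waiting outside the system
    buffer = deque()    # remaining service ticks of messages inside the node
    delivered = dropped = arrived = ticks = 0
    while delivered + dropped < num_messages:
        ticks += 1
        # One new message arrives per tick, with a randomly drawn latency.
        if arrived < num_messages:
            arrived += 1
            entrance.append(rng.randint(1, 3))
        # Move messages from the entrance into the node.
        while entrance:
            if max_in_flight is not None and len(buffer) >= max_in_flight:
                break  # admittance control: hold the message at the entrance
            latency = entrance.popleft()
            if len(buffer) < node_capacity:
                buffer.append(latency)
            else:
                dropped += 1  # buffer-limited policy: discard the message
        # The node works on the message at the head of its buffer.
        if buffer:
            buffer[0] -= 1
            if buffer[0] == 0:
                buffer.popleft()
                delivered += 1
    return delivered, dropped, ticks


if __name__ == "__main__":
    print("buffer-limited only   :", simulate(50, node_capacity=2))
    print("with admittance control:", simulate(50, node_capacity=2, max_in_flight=2))
```

When `max_in_flight` is no larger than `node_capacity`, an admitted message always finds room in the buffer, so nothing is dropped in this toy model; the cost is that messages wait longer at the entrance.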
A geometric intrinsic pre-processing algorithm (GPA for short) for solving large-scale discrete mathematical-physical PDEs in the 2-D and 3-D cases was presented by Sun (2022–2023). Unlike traditional preconditioning, the authors exploit intrinsic geometric invariance: the grid matrix G commutes with the discrete PDE mass matrix B and stiffness matrix A, i.e. BG=GB and AG=GA, where G satisfies G^m=I with m<<dim(G). A large-scale system solver can thus be replaced by a much smaller block solver as a pretreatment in the real or complex domain. In this paper, the authors extend their research to 2-D and 3-D mathematical-physical equations over a wider range of polyhedral grids, such as triangles, squares, tetrahedra, and cubes. They give the general form of the pre-processing matrix, together with the theory and numerical tests of GPA. The research concludes that "the parallelism of geometric mesh pre-transformation is mainly proportional to the number of faces of polyhedron", and further finds that "commutative of grid mesh matrix and mass matrix is an important basis for the feasibility and reliability of GPA algorithm".
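Why the commutation relations permit a reduction to smaller block solves can be sketched as follows. This is a standard spectral-projector argument under the stated assumptions, not the authors' specific pre-processing matrix; the symbols \omega, P_k, x and b are introduced here for illustration only.

```latex
% Assumptions: G^m = I, AG = GA, and \omega = e^{2\pi i/m}.
\begin{align*}
P_k &= \frac{1}{m}\sum_{j=0}^{m-1}\omega^{-jk}G^{j}, \qquad k = 0,\dots,m-1,\\
G P_k &= \omega^{k}P_k, \qquad P_kP_l = \delta_{kl}P_k, \qquad \sum_{k=0}^{m-1}P_k = I,\\
AG = GA \;&\Longrightarrow\; AP_k = P_kA,\\
Ax = b \;&\Longleftrightarrow\; (P_kAP_k)(P_kx) = P_kb \quad \text{for each } k = 0,\dots,m-1.
\end{align*}
```

Each of the m subsystems acts only on the range of P_k, so the original system separates into independent smaller problems that can be solved in parallel. Because \omega is complex in general, the blocks may live in the complex domain.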
Face detection inherently requires real-time performance. Although the Viola-Jones algorithm handles it elegantly, today's ever larger high-quality images and videos pose a new real-time challenge. Parallelizing the Viola-Jones algorithm with OpenCL is a good way to achieve high performance on both AMD and NVIDIA GPU platforms without introducing new algorithms. This paper identifies the bottleneck of this application and discusses how to optimize face detection step by step, starting from a very naive implementation. Techniques such as hiding CPU execution time, subtle use of local memory as a high-speed scratchpad and manual cache, and variable granularity are used to improve performance. These techniques yield a 4- to 13-fold speedup, depending on image size. Furthermore, these ideas may shed some light on how to parallelize applications efficiently with OpenCL. Taking face detection as an example, this paper also summarizes some general advice on optimizing OpenCL programs, to help other applications perform better on GPUs.
The numerical solution of the differential-algebraic equations (DAEs) involved in time domain simulation (TDS) of power systems requires solving a sequence of large-scale sparse linear systems. Iterative methods such as Krylov subspace methods are imperative for solving these systems. The motivation of the present work is to develop a new algorithm that efficiently preconditions the whole sequence of linear systems involved in TDS. As an improvement on the dishonest preconditioner (DP) strategy, the updating preconditioner (UP) strategy is introduced to the field of TDS for the first time. The UP strategy is based on the fact that the matrices in the sequence of linearized systems vary continuously, so two consecutive matrices differ only slightly. To make the linear system sequence in TDS suitable for the UP strategy, a matrix transformation is applied to form a new sequence of linear systems with a structure well suited to preconditioner updating. The proposed algorithm has been tested on 4 cases from real-life power systems in China. Results show that the proposed UP algorithm efficiently preconditions the sequence of linear systems and reduces the GMRES iteration count by 9%-61% compared with the DP method in all test cases. Numerical experiments also show the effectiveness of UP when combined with simple preconditioner reconstruction strategies.
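The trade-off behind reusing a preconditioner across a slowly varying sequence can be seen in a toy Python sketch. It is not the paper's UP algorithm: preconditioned Richardson iteration is swapped in for GMRES, an exact 2x2 inverse stands in for a real preconditioner, and the drifting matrices returned by the hypothetical `system(k)` are invented for illustration.

```python
def inverse_2x2(a):
    """Exact inverse of a 2x2 matrix, standing in for a real preconditioner build."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def matvec(a, x):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1]]


def richardson(a, b, m_inv, tol=1e-10, max_iter=500):
    """Preconditioned Richardson iteration x <- x + M^{-1}(b - A x).

    Returns (x, number of correction steps taken).
    """
    x = [0.0, 0.0]
    for steps in range(max_iter):
        residual = [bi - yi for bi, yi in zip(b, matvec(a, x))]
        if max(abs(v) for v in residual) < tol:
            return x, steps
        correction = matvec(m_inv, residual)
        x = [xi + ci for xi, ci in zip(x, correction)]
    return x, max_iter


def system(k):
    """Hypothetical slowly drifting matrix A_k (assumption, not from the paper)."""
    return [[4.0 + 0.2 * k, 1.0], [1.0, 3.0 + 0.2 * k]]


b = [1.0, 2.0]

# Frozen strategy: build the preconditioner once from A_0 and reuse it.
frozen_inv = inverse_2x2(system(0))
frozen_counts = [richardson(system(k), b, frozen_inv)[1] for k in range(6)]

# Rebuild strategy: construct a fresh preconditioner for every A_k.
rebuilt_counts = [richardson(system(k), b, inverse_2x2(system(k)))[1] for k in range(6)]

print("frozen :", frozen_counts)
print("rebuilt:", rebuilt_counts)
```

In this toy setting the frozen preconditioner needs more correction steps as A_k drifts away from A_0, while rebuilding keeps the step count minimal but pays the construction cost every time. An updating strategy aims to sit between these two extremes.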
Funding (Web service orchestration paper): This paper was supported by the National Natural Science Foundation of China under Grants No. 61170053, No. 61101214, and No. 61100205; the National High-Tech Research and Development Plan of China under Grant No. 2012AA010902-1; the Natural Science Foundation of Beijing under Grant No. 4112027; and the Special Project of National CAS Union-The High Performance Cloud Service Platform for Enterprise Creative Computing.
Funding (GPA paper): Supported by the Basic Research Plan on High Performance Computing of Institute of Software (No. ISCAS-PYFX-202302), the National Key R&D Program of China (No. 2020YFB1709502), and the Advanced Space Propulsion Laboratory of BICE and Beijing Engineering Research Center of Efficient and Green Aerospace Propulsion Technology (No. Lab ASP-2019-03).
Funding (Viola-Jones OpenCL paper): Supported by the National Natural Science Foundation of China (No. 61133005) and the National High-Tech Research and Development (863) Program of China (No. 2012AA010902).
Funding (power system TDS paper): Supported by the National Natural Science Foundation of China (Grant Nos. 60703055 and 60803019), the National High-Tech Research & Development Program of China ("863" Program) (Grant No. 2009AA01A129), the State Key Development Program of Basic Research of China (Grant No. 2010CB951903), and the Tsinghua National Laboratory for Information Science and Technology (THList) Cross-discipline Foundation.