Abstract: The Fourier transform is central to numerous applications in science and engineering, but its usefulness is limited by its computational expense. Seeking a faster method for computing Fourier transforms, the authors present parallel implementations of two new algorithms for the type IV Discrete Cosine Transform (DCT-IV), which support the new interleaved fast Fourier transform method. The authors discuss their implementations under two paradigms. The first uses commodity equipment and the Message-Passing Interface (MPI) library. The second uses the RapidMind development platform and the Cell Broadband Engine (BE) processor. The experiments indicate that the authors' rotation-based algorithm is preferable to their lifting-based algorithm on the platforms tested, with the MPI implementation showing increased efficiency for large data sets. Finally, the authors outline future work by discussing an architecture-oriented method for computing DCT-IVs that promises further optimization. The results indicate a promising new direction in the search for efficient ways to compute Fourier transforms.
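For readers unfamiliar with the transform named in this abstract, the following is a minimal sketch of the textbook DCT-IV definition, evaluated directly in O(N^2) time. It is not the authors' rotation-based or lifting-based algorithm, and it is neither parallel nor fast. The function names `dct_iv` and `idct_iv` and the sample signal are assumptions introduced only for illustration.

```python
import math


def dct_iv(x):
    """Direct O(N^2) DCT-IV (illustrative, hypothetical helper).

    X[k] = sum over n of x[n] * cos(pi/N * (n + 1/2) * (k + 1/2))
    """
    n_pts = len(x)
    return [
        sum(
            x[n] * math.cos(math.pi / n_pts * (n + 0.5) * (k + 0.5))
            for n in range(n_pts)
        )
        for k in range(n_pts)
    ]


def idct_iv(coeffs):
    """Inverse of the unnormalized DCT-IV.

    The unnormalized DCT-IV matrix C satisfies C * C = (N / 2) * I,
    so applying the same transform again and scaling by 2 / N
    recovers the input.
    """
    n_pts = len(coeffs)
    return [2.0 / n_pts * value for value in dct_iv(coeffs)]


# Round-trip check on an arbitrary sample signal.
signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.5, 4.0]
roundtrip = idct_iv(dct_iv(signal))
max_error = max(abs(a - b) for a, b in zip(signal, roundtrip))
```

The self-inverse property used in `idct_iv` is one reason the DCT-IV is a convenient building block: a single forward routine serves for both directions.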
Funding: Supported by the National Defence Basic Fundamental Research Program of China (Grant No. C1520110002) and the Fundamental Development Foundation of China Academy of Engineering Physics (Grant No. 2012A0202008).
Abstract: In this paper we study algorithms, and their parallel implementation, for solving large-scale generalized eigenvalue problems in modal analysis. Three predominant subspace algorithms, namely the Krylov-Schur method, the implicitly restarted Arnoldi method and the Jacobi-Davidson method, are modified with complementary techniques to make them suitable for modal analysis. Detailed descriptions of the three algorithms are given. Based on these algorithms, a parallel solution procedure is established via the PANDA framework and its associated eigensolvers. Using this procedure on a machine with up to 4,800 processors, the parallel performance of the three methods is evaluated through numerical experiments on typical engineering structures, where the largest test case reaches twenty million degrees of freedom. The speedup curves for the different cases are obtained and compared. The results show that the three methods are suitable for modal analysis at the scale of ten million degrees of freedom, with favorable parallel scalability.
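To make the problem class concrete, the sketch below solves a tiny generalized eigenvalue problem K x = lambda M x of the kind that arises in modal analysis. It swaps in plain inverse iteration, which is far simpler than the Krylov-Schur, implicitly restarted Arnoldi and Jacobi-Davidson methods named in the abstract, and it has nothing to do with the PANDA framework. The model (a fixed-fixed chain of masses joined by identical springs) and all function names are hypothetical and chosen only for illustration.

```python
import math


def stiffness_times(k, x):
    """Multiply by K = k * tridiag(-1, 2, -1), a fixed-fixed spring chain."""
    n = len(x)
    return [
        k * (2.0 * x[i]
             - (x[i - 1] if i > 0 else 0.0)
             - (x[i + 1] if i < n - 1 else 0.0))
        for i in range(n)
    ]


def solve_stiffness(k, rhs):
    """Solve K y = rhs with the Thomas algorithm for tridiagonal systems."""
    n = len(rhs)
    diag, off = 2.0 * k, -k
    c = [0.0] * n  # modified upper-diagonal coefficients
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    y = [0.0] * n
    y[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        y[i] = d[i] - c[i] * y[i + 1]
    return y


def smallest_mode(k, masses, iterations=200):
    """Inverse iteration for the smallest eigenpair of K x = lambda M x.

    M is diagonal (lumped masses). Each step solves K y = M x and
    renormalizes so that x^T M x = 1.
    """
    x = [1.0] * len(masses)
    for _ in range(iterations):
        y = solve_stiffness(k, [m * xi for m, xi in zip(masses, x)])
        norm = math.sqrt(sum(m * yi * yi for m, yi in zip(masses, y)))
        x = [yi / norm for yi in y]
    kx = stiffness_times(k, x)
    lam = sum(xi * kxi for xi, kxi in zip(x, kx))  # Rayleigh quotient
    return lam, x


# Hypothetical example: six unequal masses, spring constant 100.
masses = [1.0, 2.0, 1.5, 1.0, 0.5, 2.5]
lam, mode = smallest_mode(100.0, masses)
residual = max(
    abs(kxi - lam * m * xi)
    for kxi, m, xi in zip(stiffness_times(100.0, mode), masses, mode)
)
```

In modal analysis the eigenvalue corresponds to a squared natural frequency, so `math.sqrt(lam)` would give the lowest angular frequency of this toy chain. Production solvers target many modes of very large sparse systems, which is where the subspace methods in the abstract come in.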