A basis function expressed in terms of Chebyshev polynomials is derived. The relationship between the coefficients of the partial differential equation and the basis function is deduced. Using this relationship, one can obtain nearly the same results as those calculated by the fast Fourier transform (FFT). The pseudo-spectral matrix method is applied in this paper to the numerical simulation of incompressible laminar boundary flow on a plate. The simulation proves to be precise and efficient.
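As an illustration of what a Chebyshev pseudo-spectral matrix approach looks like in practice, the sketch below builds the standard Chebyshev differentiation matrix on Gauss-Lobatto points, so that a derivative becomes a matrix-vector product instead of an FFT-based transform. This is the textbook construction, assumed here for illustration only; the abstract does not give the paper's own basis function or matrix, and the names `cheb_diff_matrix` and `apply_matrix` are hypothetical.

```python
import math


def cheb_diff_matrix(n):
    """Return (D, x) for polynomial degree n.

    x holds the n+1 Chebyshev Gauss-Lobatto points x_j = cos(pi*j/n) and
    D is the (n+1) x (n+1) matrix that maps samples u(x_j) to samples of
    the derivative of their polynomial interpolant.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    x = [math.cos(math.pi * j / n) for j in range(n + 1)]
    # Endpoint weights of 2, with alternating signs folded in.
    c = [(2.0 if j in (0, n) else 1.0) * (-1.0) ** j for j in range(n + 1)]
    D = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i != j:
                D[i][j] = (c[i] / c[j]) / (x[i] - x[j])
        # Diagonal from the negative row sum: differentiating a constant
        # must give zero, which also improves rounding behaviour.
        D[i][i] = -sum(D[i][j] for j in range(n + 1) if j != i)
    return D, x


def apply_matrix(D, u):
    """Plain matrix-vector product D @ u for lists of floats."""
    return [sum(d * v for d, v in zip(row, u)) for row in D]
```

For a smooth function such as `exp(x)` sampled at the returned points, `apply_matrix(D, u)` approximates the derivative with spectral accuracy, which is the property that lets a matrix formulation compete with a transform-based one at moderate resolution.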
We present novel vector permutation and branch reduction methods to minimize the number of execution cycles for bit reversal algorithms. The new methods are applied to a single instruction multiple data (SIMD) parallel implementation of the complex-data floating-point fast Fourier transform (FFT). Compared with conventional implementations, the number of operational clock cycles is reduced by an average factor of 3.5 with our vector permutation methods and by a factor of 1.1 with our branch reduction methods. Experiments on the MPC7448 (a well-known SIMD reduced instruction set computing processor) demonstrate that our optimal bit-reversal algorithm consistently takes fewer than two cycles per element in complex array operations.
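For readers unfamiliar with the operation being optimized, the following is a minimal scalar sketch of the bit-reversal permutation used to reorder data in a radix-2 FFT. It is a plain reference version, not the SIMD vector permutation or branch reduction method described in the abstract; the function names `reverse_bits` and `bit_reverse_permute` are hypothetical.

```python
def reverse_bits(i, bits):
    """Return the integer whose low `bits` bits are those of i in reverse order."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)  # shift result left, append lowest bit of i
        i >>= 1
    return r


def bit_reverse_permute(a):
    """Return a copy of `a` reordered by bit-reversed index.

    The length of `a` must be a power of two, as required by radix-2 FFTs.
    """
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    out = list(a)
    for i in range(n):
        j = reverse_bits(i, bits)
        if i < j:  # swap each pair once; fixed points stay in place
            out[i], out[j] = out[j], out[i]
    return out
```

The `if i < j` test is the kind of data-dependent branch that a branch reduction technique would aim to eliminate, and the element-by-element swaps are what a vector permutation would replace with whole-register shuffles.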