The time-dependent density functional-based tight-binding (TD-DFTB) method is implemented on multi-core and graphics processing unit (GPU) systems for excited-state calculations of large systems with hundreds or thousands of atoms. Sparse matrices and OpenMP multithreading are used for building the Hamiltonian matrix. The diagonalization of the ground-state eigenvalue problem is implemented on GPUs in double precision. The GPU-based acceleration fully preserves all the properties, and a considerable total speedup of 8.73 times can be achieved. A Krylov-space-based algorithm with OpenMP parallelization and GPU acceleration is used for finding the lowest eigenvalue and eigenvector of the large TDDFT matrix, which greatly reduces the number of iterations and the time spent on the excited-state eigenvalue problem. The Krylov solver with GPU acceleration of the matrix-vector product converges quickly to the final result, and a notable speedup of 206 times is observed for a system of 812 atoms. Calculations on a series of small and large systems show that the fast TD-DFTB code obtains reasonable results at a much lower computational cost than first-principles CIS and full TDDFT calculations.
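To illustrate why a Krylov-space solver pairs well with an accelerated matrix-vector product, the following is a minimal sketch of a matrix-free Lanczos iteration for the lowest eigenvalue of a symmetric sparse matrix. It is not the authors' solver: the function names (`chain_rows`, `matvec_sparse`, `lanczos_lowest`), the row-list sparse format, and the toy nearest-neighbour chain matrix are all assumptions made for illustration, and everything runs in plain Python on the CPU. The point is only that the matrix enters through `matvec`, which is the single operation one would parallelize or offload.

```python
import math
import random


def chain_rows(n, hopping=-1.0):
    """Hypothetical test matrix: nearest-neighbour chain, zero diagonal."""
    rows = []
    for i in range(n):
        row = []
        if i > 0:
            row.append((i - 1, hopping))
        if i < n - 1:
            row.append((i + 1, hopping))
        rows.append(row)
    return rows


def matvec_sparse(rows, x):
    """Sparse matrix-vector product; rows[i] holds (column, value) pairs."""
    return [sum(v * x[j] for j, v in row) for row in rows]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def lowest_tridiagonal_eigenvalue(alpha, beta, tol=1e-12):
    """Lowest eigenvalue of the Lanczos tridiagonal matrix by bisection."""
    radius = max(abs(a) for a in alpha) + 2.0 * max([abs(b) for b in beta] + [0.0]) + 1.0
    lo, hi = -radius, radius

    def count_below(x):
        # Sturm-sequence count of eigenvalues smaller than x.
        count, d = 0, 1.0
        for i, a in enumerate(alpha):
            off = beta[i - 1] ** 2 if i > 0 else 0.0
            d = a - x - off / d
            if d == 0.0:
                d = 1e-300  # avoid division by zero in the next step
            if d < 0.0:
                count += 1
        return count

    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def lanczos_lowest(matvec, n, steps=30, seed=0):
    """Estimate the lowest eigenvalue using only calls to matvec."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(v, v))
    basis = [[x / norm for x in v]]
    alpha, beta = [], []
    for _ in range(min(steps, n)):
        w = matvec(basis[-1])  # the only place the matrix is touched
        alpha.append(dot(w, basis[-1]))
        # Full reorthogonalization against all previous Krylov vectors.
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        b = math.sqrt(dot(w, w))
        if b < 1e-10:  # Krylov space is exhausted
            break
        beta.append(b)
        basis.append([wi / b for wi in w])
    return lowest_tridiagonal_eigenvalue(alpha, beta[: len(alpha) - 1])


if __name__ == "__main__":
    n = 20
    rows = chain_rows(n)
    estimate = lanczos_lowest(lambda x: matvec_sparse(rows, x), n, steps=n)
    exact = -2.0 * math.cos(math.pi / (n + 1))  # analytic value for this chain
    print(estimate, exact)
```

With fewer Lanczos steps than the matrix dimension, the returned value is a Ritz estimate that lies at or above the true lowest eigenvalue, so the number of steps trades accuracy against the number of matrix-vector products.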