Abstract: This study proposes a microchannel liquid cooling plate (LCP), designed by topology optimization (TO), for Intel Xeon processors in a 52.5 mm × 45 mm package. First, a mathematical model for the topology optimization of the LCP is established with heat dissipation and pressure drop as objectives, and a series of two-dimensional (2D) topology-optimized configurations is obtained for different weighting factors between the two objectives. The biomimetic character of the topology-optimized flow channels is found to be more pronounced at low Reynolds numbers. Second, the 2D configuration is extruded into a three-dimensional (3D) model, and CFD simulations are performed under actual operating conditions. The results show that the topology-optimized LCP reduces thermal resistance and pressure drop by approximately 20% to 50% compared with traditional serpentine and straight microchannel designs, and improves the Nusselt number by up to 76.1% compared with straight microchannel designs. Moreover, at high flow rates, straight microchannel LCPs exhibit significant backflow and vortices, and topology-optimized LCPs also tend to lose effectiveness in the form of tree-root-shaped branch flows. Suitable flow rate ranges for the LCPs are provided. Finally, the experimentally measured temperature and pressure drop are consistent with the numerical results, which verifies the performance of the topology-optimized flow channel LCP.
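The abstract does not give the form of the two-objective model. As a minimal sketch only, one common way to combine a thermal objective and a pressure-drop objective is a normalized weighted sum; the function name, the normalization by reference values, and the convention that the weight `w` multiplies the thermal term are all assumptions made for illustration, not the authors' formulation.

```python
def weighted_objective(thermal, pressure_drop, w,
                       thermal_ref=1.0, pressure_ref=1.0):
    """Weighted-sum scalarization of two objectives (hypothetical form).

    thermal, pressure_drop: raw objective values for one candidate design.
    w: weighting factor in [0, 1]; w = 1 considers only the thermal term,
       w = 0 considers only the pressure-drop term.
    thermal_ref, pressure_ref: reference values used to make both terms
       dimensionless so that they can be added.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return (w * thermal / thermal_ref
            + (1.0 - w) * pressure_drop / pressure_ref)


def sweep_weights(thermal, pressure_drop, weights,
                  thermal_ref=1.0, pressure_ref=1.0):
    """Evaluate the combined objective for several weighting factors."""
    return [weighted_objective(thermal, pressure_drop, w,
                               thermal_ref, pressure_ref)
            for w in weights]
```

Sweeping `w` over several values is what would produce a family of designs comparable to the series of 2D configurations mentioned in the abstract, with each weight trading cooling performance against pumping cost.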
Funding: Project supported by the National Natural Science Foundation of China (No. 61272145) and the National High-Tech R&D Program (863) of China (No. 2012AA012706).
Abstract: OpenCL is an open heterogeneous programming framework. Although OpenCL programs are functionally portable, they do not provide performance portability, so code transformation often plays an irreplaceable role. When adapting GPU-specific OpenCL kernels to run on multi-core/many-core CPUs, coarsening the thread granularity is necessary and has therefore been used extensively. However, the locality optimizations written into GPU-specific OpenCL code are usually inherited without analysis, which may have side effects on CPU performance. Typically, the use of OpenCL's local memory on multi-core/many-core CPUs may degrade rather than improve performance, because local-memory arrays no longer match the hardware well and the associated synchronizations are costly. To resolve this dilemma, we analyze the memory access patterns using array-access descriptors derived from the GPU-specific kernels. The kernels can then be adapted for CPUs by (1) removing all the unwanted local-memory arrays together with the obsolete barrier statements and (2) optimizing the coalesced kernel code with vectorization and locality re-exploitation. Moreover, we have developed an automated tool chain that transforms GPU-specific OpenCL kernels into a CPU-friendly form, accompanied by a scheduler that forms a new OpenCL runtime. Experiments show that the automated transformation improves OpenCL kernel performance on a multi-core CPU by an average factor of 3.24. Satisfactory performance improvements are also achieved on Intel's many-integrated-core coprocessor. The resulting performance on both architectures is better than or comparable with the corresponding OpenMP performance.
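The transformation in step (1) can be illustrated with a small emulation in plain Python rather than OpenCL. This is a sketch under stated assumptions, not the authors' tool chain: the 3-point stencil, the clamped boundary handling, and both function names are hypothetical. The first function mimics a GPU-oriented kernel that stages a tile in a local array before computing; the second mimics the CPU-friendly form, where the work-items of a group are coarsened into one loop and the local array and its barrier are gone.

```python
def stencil_gpu_style(src, group_size):
    """Emulate a GPU-oriented kernel that stages data in 'local memory'."""
    n = len(src)
    out = [0.0] * n
    for group_start in range(0, n, group_size):
        group_end = min(group_start + group_size, n)
        # Stage the tile plus a one-element halo on each side.
        # Out-of-range indices are clamped to the array bounds.
        local = [src[min(max(i, 0), n - 1)]
                 for i in range(group_start - 1, group_end + 1)]
        # In OpenCL a barrier would be required here before the tile is read.
        for lid in range(group_end - group_start):
            out[group_start + lid] = local[lid] + local[lid + 1] + local[lid + 2]
    return out


def stencil_cpu_style(src, group_size):
    """Emulate the CPU-friendly form: coarsened loop, no local array, no barrier."""
    n = len(src)
    out = [0.0] * n
    for group_start in range(0, n, group_size):
        group_end = min(group_start + group_size, n)
        # One loop iteration per former work-item, reading 'global memory' directly.
        for gid in range(group_start, group_end):
            left = src[max(gid - 1, 0)]
            right = src[min(gid + 1, n - 1)]
            out[gid] = left + src[gid] + right
    return out
```

Both functions return the same result for any input, which is the property such a transformation must preserve; on a CPU the second form avoids the extra copy and the synchronization point, and its inner loop is a natural candidate for vectorization.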
Funding: Financially supported by the National Natural Science Foundation of China (Grant Nos. 12072217 and 42077254) and the Natural Science Foundation of Hunan Province, China (Grant No. 2022JJ30567).
Abstract: The high-resolution DEM-IMB-LBM model can accurately describe pore-scale fluid-solid interactions, but its potential for geotechnical engineering analysis has not been fully realized because of its prohibitive computational cost. To overcome this limitation, a message passing interface (MPI) parallel DEM-IMB-LBM framework is proposed to improve computational efficiency. The framework uses a static domain decomposition scheme in which the entire computational domain is decomposed into multiple subdomains according to a predefined number of processors. A detailed parallel strategy is employed for both contact detection and hydrodynamic force calculation. In particular, a particle ID re-numbering scheme is proposed to handle particle transitions across subdomain interfaces. Two benchmarks are conducted to validate the accuracy and overall performance of the proposed framework. The framework is then applied to simulate multi-particle sedimentation and submarine landslides. The numerical examples demonstrate the robustness and applicability of the MPI parallel DEM-IMB-LBM framework.
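A static domain decomposition can be sketched without any MPI calls. The example below assumes a one-dimensional split into equal-width slabs and uses hypothetical names (`owner_rank`, `find_migrations`); the actual decomposition and the particle ID re-numbering scheme of the framework are not described in the abstract and are not reproduced here. The sketch only shows how a particle's owning subdomain follows from its position and how crossings of a subdomain interface can be detected after a time step.

```python
def owner_rank(x, x_min, x_max, n_ranks):
    """Return the rank owning coordinate x under an equal-width slab split."""
    width = (x_max - x_min) / n_ranks
    rank = int((x - x_min) // width)
    # Clamp so that points on or beyond the domain edges map to a valid rank.
    return min(max(rank, 0), n_ranks - 1)


def find_migrations(old_positions, new_positions, x_min, x_max, n_ranks):
    """List particles whose owner changed between two time steps.

    old_positions, new_positions: dicts mapping particle ID to x coordinate.
    Returns a dict mapping particle ID to (source_rank, destination_rank).
    """
    moves = {}
    for pid, x_new in new_positions.items():
        src = owner_rank(old_positions[pid], x_min, x_max, n_ranks)
        dst = owner_rank(x_new, x_min, x_max, n_ranks)
        if src != dst:
            moves[pid] = (src, dst)
    return moves
```

In an MPI code each entry of the returned dictionary would correspond to particle data sent from the source rank to the destination rank; because the slabs are fixed for the whole run, the ownership rule never changes, which is what makes the decomposition static.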