8 articles found
Optimization Techniques for GPU-Based Parallel Programming Models in High-Performance Computing
1
Authors: Shuntao Tang, Wei Chen. 信息工程期刊(中英文版), 2024, No. 1, pp. 7-11 (5 pages)
This study examines optimization techniques within GPU-based parallel programming models, which are pivotal for advancing high-performance computing (HPC). Emphasizing the transition of GPUs from graphics-centric processors to versatile computing units, it covers the optimization of memory access, thread management, algorithmic design, and data structures. These optimizations are critical for exploiting the parallel processing capabilities of GPUs, and the study addresses both the theoretical frameworks and practical implementations. By integrating strategies such as memory coalescing, dynamic scheduling, and parallel algorithmic transformations, this research aims to significantly raise computational efficiency and throughput. The findings underscore the potential of optimized GPU programming to improve computational tasks across various domains, and point to a pathway towards greater processing power and efficiency in HPC environments. The paper contributes to the academic discourse on GPU optimization and also provides actionable insights for developers.
Keywords: Optimization Techniques GPU-based Parallel Programming Models High-Performance Computing
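The abstract names memory coalescing without describing it. The following is a minimal illustrative model, not taken from the paper: it counts how many memory segments one warp touches when consecutive threads read elements a fixed stride apart. The function name and the default values (32 threads per warp, 4-byte elements, 128-byte segments) are assumptions chosen to resemble common NVIDIA configurations; real hardware behaviour depends on the architecture and cache settings.

```python
def transactions_per_warp(stride, warp_size=32, elem_bytes=4, segment_bytes=128):
    """Count distinct memory segments touched by one warp (simplified model).

    Thread t reads the element at index t * stride. Each distinct
    segment_bytes-sized segment is assumed to cost one memory transaction.
    All default values are illustrative assumptions.
    """
    addresses = [t * stride * elem_bytes for t in range(warp_size)]
    return len({addr // segment_bytes for addr in addresses})


if __name__ == "__main__":
    for stride in (1, 2, 32):
        print(stride, transactions_per_warp(stride))
```

In this model a unit stride lets the whole warp share a single segment, while larger strides spread the same 32 reads over more segments, which is the effect coalescing-oriented data layouts try to avoid.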
GPU-Based DEM Simulations of Global Ice Resistance on Ship Hull During Navigation in Level Ice (cited by: 1)
2
Authors: HU Bing, LIU Lu, WANG De-yu, JI Shun-ying. China Ocean Engineering (SCIE, EI, CSCD), 2021, No. 2, pp. 228-237 (10 pages)
The ice resistance on a ship hull affects the safety of the hull structure and the ship's maneuvering performance in ice-covered regions. In this paper, the discrete element method (DEM) is adopted to simulate the interaction between level ice and a ship hull. The level ice is modeled with 3D bonded spherical elements, considering the buoyancy and drag force of the water. The parallel bonding approach and the de-bonding criterion are adopted to model the freezing and breakage of level ice. The ship hull is constructed with rigid triangle elements. To improve computational efficiency, a GPU-based parallel computational algorithm was developed for the DEM simulations. During the interaction between the ship hull and level ice, the ice cover is broken into small blocks when the interparticle stress approaches the bonding strength. The global ice resistance on the hull is calculated through the contacts between ice elements and hull elements during the navigation process. The influences of ice thickness and navigation speed on the dynamic ice force are analyzed considering the breakage mechanism of the ice cover. The Lindqvist and Riska formulas for determining ice resistance on a ship hull are employed to validate the DEM simulation. A comparison of the DEM, Lindqvist, and Riska results shows that the DEM result lies between those of the Lindqvist and Riska formulas. Therefore, the proposed DEM is an effective approach for determining the ice resistance on a ship hull. This work can aid hull structure design and navigation operations in ice-covered fields.
Keywords: global ice resistance ship hull discrete element method level ice GPU-based parallel computation
An efficient GPU-based parallel tabu search algorithm for hardware/software co-design (cited by: 5)
3
Authors: Neng Hou, Fazhi He, Yi Zhou, Yilin Chen. Frontiers of Computer Science (SCIE, EI, CSCD), 2020, No. 5, pp. 135-152 (18 pages)
Hardware/software partitioning is an essential step in hardware/software co-design. For large problems, it is difficult to balance solution quality against running time. This paper presents an efficient GPU-based parallel tabu search algorithm (GPTS) for HW/SW partitioning. A single GPU kernel for compacting the neighborhood is proposed to reduce, in theory, the number of GPU global memory accesses. A kernel fusion strategy is further proposed to reduce the number of GPU global memory accesses of GPTS. To further minimize the transfer overhead of GPTS between CPU and GPU, an optimized transfer strategy for GPU-based tabu evaluation is proposed, which accounts for the case where all the candidates fail to satisfy the given constraint. Experiments show that GPTS outperforms state-of-the-art tabu search work and is competitive with other methods for HW/SW partitioning. The proposed parallelization is significant even on an ordinary GPU platform.
Keywords: hardware/software co-design hardware/software partitioning graphics processing unit GPU-based parallel tabu search single kernel implementation kernel fusion strategy optimized transfer strategy
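For readers unfamiliar with tabu search applied to partitioning, here is a minimal sequential sketch. It is not the GPTS algorithm from the paper: the cost model (per-task software time, hardware time, and hardware area under an area budget), the flip-one-task neighborhood, and all names and parameters are assumptions made for illustration. The comment in the inner loop marks the step a GPU version could evaluate in parallel.

```python
def tabu_partition(sw_time, hw_time, hw_area, area_budget, iters=200, tenure=3):
    """Assign each task to software (0) or hardware (1).

    Minimizes total execution time subject to total hardware area
    <= area_budget. Hypothetical cost model for illustration only.
    """
    n = len(sw_time)

    def cost(x):
        return sum(hw_time[i] if x[i] else sw_time[i] for i in range(n))

    def area(x):
        return sum(hw_area[i] for i in range(n) if x[i])

    current = [0] * n                 # all-software start uses no hardware area
    best, best_cost = current[:], cost(current)
    tabu_until = [0] * n              # iteration until which flipping task i is tabu

    for it in range(1, iters + 1):
        candidate, candidate_cost, move = None, None, None
        # Evaluate the whole flip-one-task neighborhood. Each neighbor is
        # independent, so a GPU version could give each one its own thread.
        for i in range(n):
            neighbor = current[:]
            neighbor[i] ^= 1
            if area(neighbor) > area_budget:
                continue              # skip infeasible neighbors
            c = cost(neighbor)
            # Aspiration rule: a tabu move is allowed only if it beats the best.
            if tabu_until[i] >= it and c >= best_cost:
                continue
            if candidate_cost is None or c < candidate_cost:
                candidate, candidate_cost, move = neighbor, c, i
        if candidate is None:
            continue                  # every move was tabu or infeasible
        current = candidate
        tabu_until[move] = it + tenure
        if candidate_cost < best_cost:
            best, best_cost = candidate[:], candidate_cost
    return best, best_cost


if __name__ == "__main__":
    print(tabu_partition([10, 8, 6, 4], [2, 3, 5, 1], [5, 4, 1, 3], area_budget=8))
```

The tabu list lets the search accept a worse neighbor to escape a local optimum without immediately undoing the move, while the best feasible assignment seen so far is kept separately.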
Real-time Virtual Environment Signal Extraction and Denoising Using Programmable Graphics Hardware
4
Authors: Yang Su, Zhi-Jie Xu, Xiang-Qian Jiang. International Journal of Automation and Computing (EI), 2009, No. 4, pp. 326-334 (9 pages)
The sense of being within a three-dimensional (3D) space and interacting with virtual 3D objects in a computer-generated virtual environment (VE) often requires essential image, vision and sensor signal processing techniques such as differentiating and denoising. This paper describes novel implementations of Gaussian filtering for characteristic signal extraction and of wavelet-based image denoising algorithms that run on the graphics processing unit (GPU). Significant acceleration over standard CPU implementations is obtained by exploiting the data parallelism provided by modern programmable graphics hardware, and the CPU is freed up to run other computations, such as artificial intelligence (AI) and physics, more efficiently. The proposed GPU-based Gaussian filtering can extract surface information from a real object and provide its material features for rendering and illumination. The wavelet-based signal denoising for large digital images realized in this project provided better realism for VE visualization without sacrificing the real-time and interactive performance of an application.
Keywords: Virtual environment graphics processing unit GPU-based Gaussian filtering signal denoising WAVELET
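As a point of reference for the Gaussian filtering mentioned in the abstract, the sketch below applies a normalized 1D Gaussian kernel to a signal on the CPU. It is a generic textbook formulation, not the paper's GPU implementation; the function names, the default `sigma` and `radius`, and the clamp-at-border policy are assumptions. Each output sample depends only on a fixed window of inputs, which is why the operation maps naturally onto data-parallel hardware.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_filter_1d(signal, sigma=1.0, radius=3):
    """Smooth a list of numbers; samples past the ends are clamped to the border."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(signal)
    out = []
    for i in range(n):                # independent per sample: parallelizable
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)   # clamp index to [0, n - 1]
            acc += w * signal[j]
        out.append(acc)
    return out


if __name__ == "__main__":
    print(gaussian_filter_1d([0, 0, 0, 1, 0, 0, 0]))
```

A 2D image filter can be built from the same kernel by filtering rows and then columns, since the Gaussian is separable.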
GPU Accelerated Real-Time Collision Handling in Virtual Disassembly (cited by: 7)
5
Authors: 杜鹏, 赵杰伊, 潘万彬, 王毅刚. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2015, No. 3, pp. 511-518 (8 pages)
Previous collision detection methods for virtual disassembly mainly detect collisions at discrete time intervals and use oriented bounding boxes to speed up the process. However, these discrete methods cannot guarantee that no penetration occurs when the components move. Moreover, once components have become embedded in each other, they cannot be separated in the subsequent process. To solve these problems, we propose an approach for real-time collision handling that utilizes the computational power of modern GPUs. First, we present a novel GPU-based collision handling framework for virtual disassembly. Second, we use collision-streams-based continuous collision detection to guarantee that no collision is missed. Finally, we introduce a triangle intersection detection algorithm to handle the case where collisions cannot be detected because components are embedded in each other in the initial configuration. The experimental results show that our method can improve the overall performance of collision detection and achieve real-time simulation.
Keywords: GPU-based virtual disassembly continuous collision detection discrete collision detection collision handling
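The difference between discrete and continuous collision detection can be shown with a deliberately small 1D example. This is not the collision-streams method of the paper, which works on triangle meshes; the point-versus-wall setup and every name below are assumptions for illustration. A fast point can jump over a thin wall between two samples, so the discrete test misses it, while testing the swept segment of each step does not.

```python
def discrete_hit(x0, v, dt, steps, wall_lo, wall_hi):
    """Test only the sampled positions x0 + v * k * dt."""
    for k in range(steps + 1):
        x = x0 + v * k * dt
        if wall_lo <= x <= wall_hi:
            return True
    return False


def continuous_hit(x0, v, dt, steps, wall_lo, wall_hi):
    """Test the whole segment swept during each time step."""
    for k in range(steps):
        a = x0 + v * k * dt
        b = x0 + v * (k + 1) * dt
        lo, hi = min(a, b), max(a, b)
        if lo <= wall_hi and hi >= wall_lo:   # interval overlap test
            return True
    return False


if __name__ == "__main__":
    args = (0.0, 10.0, 0.1, 5, 0.45, 0.55)   # thin wall between two samples
    print(discrete_hit(*args), continuous_hit(*args))
```

The continuous test costs more per step, which is part of why mesh-scale versions of it benefit from GPU acceleration.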
3D large-scale SPH modeling of vehicle wading with GPU acceleration
6
Authors: Huashan Zhang, Xiaoxiao Li, Kewei Feng, Moubin Liu. Science China (Physics, Mechanics & Astronomy) (SCIE, EI, CAS, CSCD), 2023, No. 10, pp. 70-91 (22 pages)
Vehicle wading is a complex fluid-structure interaction (FSI) problem and has recently attracted great attention from the automotive industry, especially for electric vehicles. As a meshless Lagrangian particle method, smoothed particle hydrodynamics (SPH) is one of the most suitable candidates for simulating vehicle wading due to its inherent advantages in modeling free surface flows, splash, and moving interfaces. Nevertheless, the unavoidable neighbor query for the nearest adjacent particles within the support domain leads to considerable computational cost and thus limits its application in 3D large-scale simulations. In this work, a GPU-based SPH method is developed with an adaptive spatial sort technology for simulations of vehicle wading. In addition, a fast, easy-to-implement particle generator is presented for isotropic initialization of the complex vehicle geometry with optimal interpolation properties. A comparative study of vehicle wading through a puddle, between the GPU-based SPH method and two commercial software packages, is used to verify the capability of the GPU-based SPH method in terms of convergence analysis, kinematic characteristics, and computing performance. Finally, different vehicle speeds, water depths, and puddle widths are tested to investigate vehicle wading numerically. The results demonstrate that the adaptive spatial sort technology can significantly improve the computing performance of the GPU-based SPH method and makes it a competitive tool for the study of 3D large-scale FSI problems, including vehicle wading. Some helpful findings on the critical vehicle speed, water depth, and boundary wall effect are also reported in this work.
Keywords: vehicle wading fluid-structure interaction GPU-based SPH adaptive spatial sort technology
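The neighbor query that the abstract identifies as the main cost of SPH is commonly organized with a uniform grid of cells. The sketch below shows that generic cell-based search, assuming a cell size equal to the support radius `h`; it is not the paper's adaptive spatial sort, whose details the abstract does not give, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def build_grid(points, h):
    """Bin particle indices into cubic cells of edge length h."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(int(c // h) for c in p)   # floor division handles negatives
        grid[key].append(idx)
    return grid


def find_neighbors(points, h):
    """Return {i: sorted indices j != i with distance(i, j) <= h}."""
    grid = build_grid(points, h)
    result = {i: [] for i in range(len(points))}
    dim = len(points[0]) if points else 0
    offsets = list(product((-1, 0, 1), repeat=dim))
    h2 = h * h
    for key, members in grid.items():
        for off in offsets:
            # Only the cell itself and its adjacent cells can hold neighbors.
            other = grid.get(tuple(k + o for k, o in zip(key, off)))
            if not other:
                continue
            for i in members:
                for j in other:
                    if i == j:
                        continue
                    d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                    if d2 <= h2:
                        result[i].append(j)
    return {i: sorted(js) for i, js in result.items()}


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.4, 0.0, 0.0), (3.0, 3.0, 3.0)]
    print(find_neighbors(pts, h=1.0))
```

Restricting each query to adjacent cells replaces the all-pairs comparison with a local one; sorting particles by cell key, as spatial sort schemes do, additionally keeps particles in the same cell close together in memory.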
Efficient electro-magnetic analysis of a GPU bitsliced AES implementation
7
Authors: Yiwen Gao, Yongbin Zhou, Wei Cheng. Cybersecurity (CSCD), 2020, No. 1, pp. 54-70 (17 pages)
The advent of CUDA-enabled GPUs makes it possible to provide cloud applications with high-performance data security services. Unfortunately, recent studies have shown that GPU-based applications are also susceptible to side-channel attacks. These published works studied the side-channel vulnerabilities of GPU-based AES implementations by taking advantage of the cache sharing among multiple threads or the high parallelism of GPUs. Therefore, for GPU-based bitsliced cryptographic implementations, which are immune to the cache-based attacks referred to above, only a power analysis method based on the high parallelism of GPUs may be effective. However, the leakage model used in the power analysis is not efficient in practice. In light of this, we investigate electro-magnetic (EM) side-channel vulnerabilities of a GPU-based bitsliced AES implementation from the perspective of bit-level parallelism and thread-level parallelism, in order to make the best of the localization effect of EM leakage with parallelism. Specifically, we propose efficient multi-bit and multi-thread combinational analysis techniques based on the intrinsic properties of bitsliced ciphers and the effect of multi-thread parallelism of GPUs, respectively. The experimental results show that the proposed combinational analysis methods perform better than non-combinational and intuitive ones. Our research suggests that multi-thread leakages can be used to improve attacks if the multi-thread leakages are not synchronous in the time domain.
Keywords: GPU-based cryptographic implementations Side-channel analysis(SCA) Electro-magnetic attacks(EMA) Micro-architectural vulnerabilities Combinational analysis
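The bit-level parallelism the abstract refers to comes from the bitsliced data layout, which the sketch below illustrates on the simplest AES step, the round-key XOR. It is a generic illustration rather than the implementation analysed in the paper; the function names and the use of Python integers as arbitrarily wide registers are assumptions. Bit `b` of every input byte is gathered into one word, so a single bitwise operation on that word processes the same bit of all inputs at once.

```python
def to_slices(values, width=8):
    """Transpose values into bit slices.

    slices[b] collects bit b of every value; bit position i of each
    slice belongs to values[i].
    """
    slices = [0] * width
    for i, v in enumerate(values):
        for b in range(width):
            slices[b] |= ((v >> b) & 1) << i
    return slices


def from_slices(slices, count):
    """Inverse of to_slices for `count` values."""
    values = [0] * count
    for b, s in enumerate(slices):
        for i in range(count):
            values[i] |= ((s >> i) & 1) << b
    return values


def add_round_key_bitsliced(slices, key_byte, count):
    """XOR one key byte into all `count` values using one XOR per bit slice."""
    mask = (1 << count) - 1           # all-ones word covering every value
    return [s ^ (mask if (key_byte >> b) & 1 else 0)
            for b, s in enumerate(slices)]


if __name__ == "__main__":
    data = [0x00, 0x53, 0xFF, 0xA7]
    out = add_round_key_bitsliced(to_slices(data), 0x3C, len(data))
    print([hex(v) for v in from_slices(out, len(data))])
```

Because every step is expressed as bitwise logic on whole slices rather than as data-dependent table lookups, bitsliced code has no secret-dependent memory addresses, which is the property behind the abstract's remark about cache-based attacks.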
Efficient electro-magnetic analysis of a GPU bitsliced AES implementation
8
Authors: Yiwen Gao, Yongbin Zhou, Wei Cheng. Cybersecurity, 2018, No. 1, pp. 680-696 (17 pages)
The advent of CUDA-enabled GPUs makes it possible to provide cloud applications with high-performance data security services. Unfortunately, recent studies have shown that GPU-based applications are also susceptible to side-channel attacks. These published works studied the side-channel vulnerabilities of GPU-based AES implementations by taking advantage of the cache sharing among multiple threads or the high parallelism of GPUs. Therefore, for GPU-based bitsliced cryptographic implementations, which are immune to the cache-based attacks referred to above, only a power analysis method based on the high parallelism of GPUs may be effective. However, the leakage model used in the power analysis is not efficient in practice. In light of this, we investigate electro-magnetic (EM) side-channel vulnerabilities of a GPU-based bitsliced AES implementation from the perspective of bit-level parallelism and thread-level parallelism, in order to make the best of the localization effect of EM leakage with parallelism. Specifically, we propose efficient multi-bit and multi-thread combinational analysis techniques based on the intrinsic properties of bitsliced ciphers and the effect of multi-thread parallelism of GPUs, respectively. The experimental results show that the proposed combinational analysis methods perform better than non-combinational and intuitive ones. Our research suggests that multi-thread leakages can be used to improve attacks if the multi-thread leakages are not synchronous in the time domain.
Keywords: GPU-based cryptographic implementations Side-channel analysis(SCA) Electro-magnetic attacks(EMA) Micro-architectural vulnerabilities Combinational analysis