72 articles found
TIME-DOMAIN INTERPOLATION ON GRAPHICS PROCESSING UNIT (cited by: 1)
1
Authors: XIQI LI, GUOHUA SHI, YUDONG ZHANG. Journal of Innovative Optical Health Sciences (SCIE, EI, CAS), 2011, No. 1, pp. 89-95 (7 pages)
The signal processing speed of spectral domain optical coherence tomography (SD-OCT) has become a bottleneck in many medical applications. Recently, a time-domain interpolation method was proposed. Compared with the commonly used zero-padding interpolation method, it achieves a better signal-to-noise ratio (SNR) with much shorter signal processing time in SD-OCT data processing. Additionally, each resampled data point can be obtained from a few data points and coefficients within the cutoff window, so many interpolations can be performed simultaneously, which makes the method suitable for parallel computing. Using a graphics processing unit (GPU) and the compute unified device architecture (CUDA) programming model, time-domain interpolation can be accelerated significantly. The computing capability exceeds 250,000 A-lines, 200,000 A-lines, and 160,000 A-lines per second for 2,048-pixel OCT when the cutoff length is L=11, L=21, and L=31, respectively. A frame of SD-OCT data (400 A-lines × 2,048 pixels per line) is acquired and processed on the GPU in real time. The results show that signal processing of the SD-OCT frame can be finished in 6.223 ms when the cutoff length is L=21, which is much faster than on a central processing unit (CPU). Real-time signal processing of acquired data can thus be realized.
Keywords: optical coherence tomography; real-time signal processing; graphics processing unit; GPU; CUDA
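
Editorial sketch: the abstract above notes that each resampled point depends only on a few samples and coefficients inside the cutoff window, so all outputs can be computed independently. The Python below illustrates that structure under stated assumptions: the Hann-windowed sinc coefficients and the names `resample_point`, `resample`, and `cutoff_len` are hypothetical, since the paper's actual coefficients are not given in this listing.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def resample_point(samples, t, cutoff_len=21):
    """Estimate the signal at fractional index t from the cutoff_len nearest samples.

    Assumption: Hann-windowed sinc coefficients; the paper's kernel may differ.
    """
    half = cutoff_len // 2
    center = int(round(t))
    acc = 0.0
    for n in range(center - half, center + half + 1):
        if 0 <= n < len(samples):
            d = t - n
            # Hann taper that falls to zero just outside the cutoff window.
            window = 0.5 * (1.0 + math.cos(math.pi * d / (half + 1)))
            acc += samples[n] * sinc(d) * window
    return acc

def resample(samples, positions, cutoff_len=21):
    """Every output is independent, so a GPU could assign one thread per output."""
    return [resample_point(samples, t, cutoff_len) for t in positions]
```

A shorter cutoff length means fewer multiply-adds per output, which is consistent with the higher A-line rates the abstract reports for L=11 than for L=31.
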
Simulation of fluid-structure interaction in a microchannel using the lattice Boltzmann method and size-dependent beam element on a graphics processing unit
2
Authors: Vahid Esfahanian, Esmaeil Dehdashti, Amir Mehdi Dehrouye-Semnani. Chinese Physics B (SCIE, EI, CAS, CSCD), 2014, No. 8, pp. 389-395 (7 pages)
Fluid-structure interaction (FSI) problems in microchannels play a prominent role in many engineering applications. The present study is an effort toward the simulation of flow in a microchannel considering FSI. The bottom boundary of the microchannel is simulated by size-dependent beam elements for the finite element method (FEM) based on a modified couple stress theory. The lattice Boltzmann method (LBM) using the D2Q13 LB model is coupled to the FEM in order to solve the fluid part of the FSI problem. Because the LBM generally needs only nearest-neighbor information, the algorithm is an ideal candidate for parallel computing. The simulations are carried out on graphics processing units (GPUs) using the compute unified device architecture (CUDA). In the present study, the governing equations are non-dimensionalized and the set of dimensionless groups is presented to show their effects on micro-beam displacement. The numerical results show that the displacements of the micro-beam predicted by the size-dependent beam element are smaller than those predicted by the classical beam element.
Keywords: fluid-structure interaction; graphics processing unit; lattice Boltzmann method; size-dependent beam element
Compute Unified Device Architecture Implementation of Euler/Navier-Stokes Solver on Graphics Processing Unit Desktop Platform for 2-D Compressible Flows
3
Authors: Zhang Jiale, Chen Hongquan. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2016, No. 5, pp. 536-545 (10 pages)
A personal desktop platform with teraflops peak performance and thousands of cores is realized at the price of a conventional workstation using programmable graphics processing units (GPUs). A GPU-based parallel Euler/Navier-Stokes solver is developed for 2-D compressible flows using NVIDIA's Compute Unified Device Architecture (CUDA) programming model in the CUDA Fortran programming language. The techniques for implementing CUDA kernels, the double-layered thread hierarchy, and the varied memory hierarchy are presented to form the GPU-based algorithm for the Euler/Navier-Stokes equations. The resulting parallel solver is validated by a set of typical test flow cases. The numerical results show that a speedup of dozens of times relative to a serial CPU implementation can be achieved on a single-GPU desktop platform, which demonstrates that a GPU desktop can serve as a cost-effective parallel computing platform to accelerate computational fluid dynamics (CFD) simulations substantially.
Keywords: graphics processing unit (GPU); GPU parallel computing; compute unified device architecture (CUDA) Fortran; finite volume method (FVM); acceleration
Multi-relaxation-time lattice Boltzmann simulations of lid driven flows using graphics processing unit
4
Authors: Chenggong LI, J.P.Y. MAA. Applied Mathematics and Mechanics (English Edition) (SCIE, EI, CSCD), 2017, No. 5, pp. 707-722 (16 pages)
Large eddy simulation (LES) using the Smagorinsky eddy viscosity model is added to the two-dimensional nine-velocity (D2Q9) lattice Boltzmann equation (LBE) with multi-relaxation-time (MRT) to simulate incompressible turbulent cavity flows with Reynolds numbers up to 1 × 10^7. To improve the computational efficiency of the LBM in numerical simulations of turbulent flows, the massively parallel computing power of a graphic processing unit (GPU) with the computing unified device architecture (CUDA) is introduced into the MRT-LBE-LES model. The model performs well compared with the results of others, with a 76-fold increase in computational efficiency. It appears that the higher the Reynolds number is, the smaller the Smagorinsky constant should be, if the lattice number is fixed. Also, for a selected high Reynolds number and a properly selected Smagorinsky constant, there is a minimum requirement for the lattice number so that the Smagorinsky eddy viscosity will not be excessively large.
Keywords: large eddy simulation (LES); multi-relaxation-time (MRT); lattice Boltzmann equation (LBE); two-dimensional nine velocity components (D2Q9); Smagorinsky model; graphic processing unit (GPU); computing unified device architecture (CUDA)
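
Editorial sketch: to make the D2Q9 lattice and the role of the Smagorinsky eddy viscosity concrete, the Python below writes out the standard D2Q9 velocities, weights, and second-order equilibrium in lattice units. It is not the paper's MRT scheme: `effective_tau` uses the single-relaxation-time relation tau = 3(nu0 + nu_t) + 0.5 as a simplifying assumption, and all function and parameter names are hypothetical.

```python
# D2Q9 lattice: one rest velocity, four axis-aligned, four diagonal.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

def equilibrium(rho, ux, uy):
    """Second-order equilibrium distribution for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq

def effective_tau(nu0, c_smag, strain_mag, delta=1.0):
    """Relaxation time with Smagorinsky eddy viscosity nu_t = (C_s * delta)^2 * |S|.

    Assumption: single-relaxation-time form in lattice units, not the paper's MRT model.
    """
    nu_t = (c_smag * delta) ** 2 * strain_mag
    return 3.0 * (nu0 + nu_t) + 0.5
```

Because the update at each node needs only local values and nearest neighbors, a GPU implementation would typically map one thread to one lattice node.
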
Complex hexagonal close-packed dendritic growth during alloy solidification by graphics processing unit-accelerated three-dimensional phase-field simulations:demo for Mg–Gd alloy
5
Authors: Sheng-Lan Yang, Jing Zhong, Kai Wang, Xun Kang, Jian-Bao Gao, Jiong Wang, Qian Li, Li-Jun Zhang. Rare Metals (SCIE, EI, CAS, CSCD), 2023, No. 10, pp. 3468-3484 (17 pages)
In this study, insights into the effect of interfacial anisotropy on complex hexagonal close-packed (hcp) dendritic growth during alloy solidification were gained by graphics processing unit (GPU)-accelerated three-dimensional (3D) phase-field simulations, as demonstrated for a Mg-Gd alloy. An anisotropic phase-field model with finite interface dissipation was developed by incorporating the contribution of the anisotropy of interfacial energy into the total free energy functional. The modified spherical harmonic anisotropy function was then chosen for the hcp crystal. The GPU parallel computing algorithm was implemented in the present phase-field model, and a corresponding code was developed on the compute unified device architecture parallel computing platform. Benchmark tests indicated that the calculation efficiency of a single TESLA V100 GPU could be ~80 times that of open multi-processing (OpenMP) with eight central processing unit cores. By coupling the phase-field model with reliable thermodynamic and interfacial energy descriptions, the 3D phase-field simulation of α-Mg dendritic growth in the Mg-6Gd (in wt%) alloy during solidification was performed. Various two-dimensional dendrite morphologies were revealed by cutting the simulated 3D dendrite along different crystallographic planes. Typical sixfold equiaxed and butterfly-shaped microstructures observed in experiments were well reproduced.
Keywords: interfacial anisotropy; dendrite solidification; phase-field model; graphics processing unit (GPU); Mg–Gd
Real-time color holographic video reconstruction using multiple-graphics processing unit cluster acceleration and three spatial light modulators (cited by: 4)
6
Authors: Shohei Ikawa, Naoki Takada, Hiromitsu Araki, Hiroaki Niwase, Hiromi Sannomiya, Hirotaka Nakayama, Minoru Oikawa, Yuichiro Mori, Takashi Kakue, Tomoyoshi Shimobaba, Tomoyoshi Ito. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2020, No. 1, pp. 18-22 (5 pages)
We demonstrate real-time three-dimensional (3D) color video using a color electroholographic system with a cluster of multiple graphics processing units (multi-GPU) and three spatial light modulators (SLMs) corresponding respectively to red, green, and blue (RGB) reconstructing lights. The multi-GPU cluster has a computer-generated hologram (CGH) display node containing a GPU, for displaying calculated CGHs on the SLMs, and four CGH calculation nodes using 12 GPUs. The GPUs in the CGH calculation nodes generate CGHs corresponding to the RGB reconstructing lights in a 3D color video using pipeline processing. Real-time color electroholography was realized for a 3D color object comprising approximately 21,000 points per color.
Keywords: color electroholography; real-time electroholography; multiple-graphics processing unit cluster; graphics processing unit
Parallel Image Processing: Taking Grayscale Conversion Using OpenMP as an Example
7
Authors: Bayan AlHumaidan, Shahad Alghofaily, Maitha Al Qhahtani, Sara Oudah, Naya Nagy. Journal of Computer and Communications, 2024, No. 2, pp. 1-10 (10 pages)
In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of color images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project were to optimize computation time and improve overall efficiency in the grayscale conversion of color images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite these challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP's effectiveness in accelerating image manipulation tasks.
Keywords: parallel computing; image processing; OpenMP; parallel programming; high performance computing; GPU (graphic processing unit)
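
Editorial sketch: the work distribution described above (split the image among workers, convert each part independently, then merge) can be illustrated in Python. This is not the paper's OpenMP code: the ITU-R BT.601 luma weights, the row-block split, and the names `gray_rows`, `to_grayscale_parallel`, and `parallel_efficiency` are assumptions, and Python threads share the interpreter lock, so the sketch shows the pattern rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: ITU-R BT.601 luma weights; the paper's exact formula is not stated here.
WEIGHTS = (0.299, 0.587, 0.114)

def gray_rows(rows):
    """Convert a block of rows of (r, g, b) tuples to rows of gray levels."""
    return [
        [int(round(WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b)) for (r, g, b) in row]
        for row in rows
    ]

def to_grayscale_parallel(image, workers=4):
    """Split the image into contiguous row blocks and convert them concurrently."""
    chunk = max(1, (len(image) + workers - 1) // workers)
    blocks = [image[i:i + chunk] for i in range(0, len(image), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(gray_rows, blocks))  # map preserves block order
    return [row for block in results for row in block]

def parallel_efficiency(t_serial, t_parallel, cores):
    """Efficiency = speedup / cores; values below 1 indicate parallel overhead."""
    return (t_serial / t_parallel) / cores
```

The efficiency helper mirrors the abstract's observation: a run that is 4 times faster on 8 cores has an efficiency of only 0.5.
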
Real-time spatiotemporal division multiplexing electroholography for 1,200,000 object points using multiple-graphics processing unit cluster (cited by: 1)
8
Authors: Hiromi Sannomiya, Naoki Takada, Kohei Suzuki, Tomoya Sakaguchi, Hirotaka Nakayama, Minoru Oikawa, Yuichiro Mori, Takashi Kakue, Tomoyoshi Shimobaba, Tomoyoshi Ito. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2020, No. 7, pp. 28-32 (5 pages)
The calculation of computer-generated holograms is computationally extremely expensive, and the image quality deteriorates when reconstructing three-dimensional (3D) holographic video from a point-cloud model comprising a huge number of object points. To solve these problems, we implement herein a spatiotemporal division multiplexing method on a cluster system with 13 GPUs connected by a gigabit Ethernet network. A performance evaluation indicates that the proposed method can realize a real-time holographic video of a 3D object comprising ~1,200,000 object points. These results demonstrate a clear 3D holographic video at 32.7 frames per second reconstructed from a 3D object comprising 1,064,462 object points.
Keywords: real-time electroholography; multiple-graphics processing unit cluster; graphics processing unit; spatiotemporal division multiplexing electroholography
A graphics processing unit-based robust numerical model for solute transport driven by torrential flow condition (cited by: 1)
9
Authors: Jing-ming HOU, Bao-shan SHI, Qiu-hua LIANG, Yu TONG, Yong-de KANG, Zhao-an ZHANG, Gang-gang BAI, Xu-jun GAO, Xiao YANG. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2021, No. 10, pp. 835-850 (16 pages)
Solute transport simulations are important in water pollution events. This paper introduces a finite volume Godunov-type model for solving a 4×4 matrix form of the hyperbolic conservation laws consisting of the 2D shallow water equations and transport equations. The model adopts the Harten-Lax-van Leer-contact (HLLC) approximate Riemann solver to calculate the cell interface fluxes. It deals well with changes in the wet-dry interfaces over actual complex terrain, and it has a strong shock-capturing ability. Using monotonic upstream-centred scheme for conservation laws (MUSCL) linear reconstruction with limited slopes together with the Runge-Kutta time integration method achieves second-order accuracy. At the same time, the introduction of graphics processing unit (GPU)-accelerated computing technology greatly increases the computing speed. The model is validated against multiple benchmarks, and the results are in good agreement with analytical solutions and other published numerical predictions. In the third test case, the GPU and central processing unit (CPU) calculation models take 3.865 s and 13.865 s, respectively, indicating that the GPU calculation model increases the calculation speed by a factor of 3.6. In the fourth test case, the GPU-based numerical model is 9.8 to 44.6 times more efficient than the traditional CPU-based numerical model across different grid resolutions. Therefore, it has better potential than previous models for large-scale simulation of solute transport in water pollution incidents. It can provide a reliable theoretical basis and strong data support for the rapid assessment and early warning of water pollution accidents.
Keywords: solute transport; shallow water equations; Godunov-type scheme; Harten-Lax-van Leer-contact (HLLC) Riemann solver; graphics processing unit (GPU) acceleration technology; torrential flow
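
Editorial sketch: two small pieces of the abstract above can be checked in Python. The first is slope-limited MUSCL reconstruction in 1D; the minmod limiter is an assumption, since the listing does not say which limiter the paper uses. The second is the speedup arithmetic for the third test case. The names `minmod`, `muscl_face_values`, and `speedup` are hypothetical.

```python
def minmod(a, b):
    """Return the smaller-magnitude slope, or zero at a local extremum."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def muscl_face_values(u):
    """Left and right face values u_i -/+ 0.5 * slope for each interior cell."""
    faces = []
    for i in range(1, len(u) - 1):
        slope = minmod(u[i] - u[i - 1], u[i + 1] - u[i])
        faces.append((u[i] - 0.5 * slope, u[i] + 0.5 * slope))
    return faces

def speedup(t_cpu, t_gpu):
    """Ratio of CPU to GPU wall-clock time."""
    return t_cpu / t_gpu
```

With the reported timings, `speedup(13.865, 3.865)` is about 3.59, which matches the factor of 3.6 quoted in the abstract. Limiting the slope to zero at extrema is what keeps the reconstruction from creating new oscillations near shocks and wet-dry fronts.
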
Exploiting Parallelism in the Simulation of General Purpose Graphics Processing Unit Program
10
Authors: 赵夏, 马胜, 陈微, 王志英. Journal of Shanghai Jiaotong University (Science) (EI), 2016, No. 3, pp. 280-288 (9 pages)
Simulation is an important means of evaluating the performance of computer architectures. Currently, the serial simulation of general purpose graphics processing unit (GPGPU) architectures is the main bottleneck for simulation speed. To address this issue, we propose intra-kernel parallelization on a multicore processor and inter-kernel parallelization on a multiple-machine platform. We apply these two methods to the GPGPU-sim simulator. The intra-kernel parallelization method first parallelizes the serial simulation of multiple compute units in one cycle. Then it parallelizes the timing and functional simulation to reduce the performance loss caused by synchronization between different compute units. The inter-kernel parallelization method divides the multiple kernels of a CUDA program into several groups and distributes these groups across multiple simulation hosts to perform the simulation. Experimental results show that the intra-kernel parallelization method achieves a speedup of up to 12 with a maximum error rate of 0.0094% on a 32-core machine, and the inter-kernel parallelization method accelerates the simulation by a factor of up to 3.9 with a maximum error rate of 0.11% on four simulation hosts. The orthogonality of the two methods allows us to combine them on multiple multi-core hosts to obtain further performance improvements.
Keywords: general purpose graphics processing unit (GPGPU); multicore; intra-kernel; inter-kernel; parallel
Parallelizing maximum likelihood classification on computer cluster and graphics processing unit for supervised image classification
11
Authors: Xuan Shi, Bowei Xue. International Journal of Digital Earth (SCIE, EI), 2017, No. 7, pp. 737-748 (12 pages)
Supervised image classification has been widely utilized in a variety of remote sensing applications. As large volumes of satellite imagery and aerial photos become increasingly available, high-performance image processing solutions are required to handle data at large scale. This paper introduces how the maximum likelihood classification approach is parallelized for implementation on a computer cluster and a graphics processing unit to achieve high performance when processing big imagery data. The solution is scalable and satisfies the needs of change detection, object identification, and exploratory analysis on large-scale high-resolution imagery data in remote sensing applications.
Keywords: maximum likelihood classification; supervised classification; parallel computing; graphics processing unit
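
Editorial sketch: maximum likelihood classification assigns each pixel to the class whose Gaussian model gives it the highest likelihood, and each pixel is handled independently, which is why the method parallelizes well. The Python below assumes a diagonal covariance (independent bands) for brevity; the full method uses a covariance matrix per class. The class statistics and the names `log_likelihood` and `classify` are hypothetical.

```python
import math

def log_likelihood(pixel, mean, var):
    """Log of a Gaussian density with independent bands (diagonal covariance assumed)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (p - m) ** 2 / (2.0 * v)
        for p, m, v in zip(pixel, mean, var)
    )

def classify(pixels, classes):
    """Label each pixel with the most likely class.

    classes maps a class name to (per-band means, per-band variances),
    normally estimated from training samples.
    """
    return [
        max(classes, key=lambda name: log_likelihood(px, *classes[name]))
        for px in pixels
    ]
```

On a cluster the pixel list would be split among nodes, and on a GPU one thread per pixel is the natural mapping, since no pixel depends on another.
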
Bypass-Enabled Thread Compaction for Divergent Control Flow in Graphics Processing Units
12
Authors: 李炳超, 魏继增, 郭炜, 孙济洲. Journal of Shanghai Jiaotong University (Science) (EI), 2021, No. 2, pp. 245-256 (12 pages)
Graphics processing units (GPUs) employ single instruction multiple data (SIMD) hardware to run threads in parallel and allow each thread to maintain an arbitrary control flow. Threads running concurrently within a warp may jump to different paths after conditional branches. Such divergent control flow leaves some lanes idle and hence reduces the SIMD utilization of GPUs. To alleviate the waste of SIMD lanes, threads from multiple warps can be collected together and compacted into idle lanes to improve SIMD lane utilization. However, this mechanism induces extra barrier synchronizations, since warps have to stall and wait for other warps before compaction, so that in some cases no warps can be scheduled. In this paper, we propose an approach to reduce the overhead of the barrier synchronizations induced by compactions. In our approach, a compaction is bypassed by warps whose threads all jump to the same path after branches. Moreover, warps waiting for a compaction can also bypass it when no warps are ready for issuing. In addition, a compaction is canceled if it cannot reduce the number of idle lanes. The experimental results demonstrate that our approach provides an average improvement of 21% over the baseline GPU for applications with massive divergent branches, while recovering the performance loss induced by compactions by 13% on average for applications with many non-divergent control flows.
Keywords: graphics processing unit (GPU); single instruction multiple data (SIMD); thread; warps; bypass
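
Editorial sketch: a toy model in Python of why divergence wastes SIMD lanes and how compaction and bypassing change the count of issued warp instructions. It is a deliberate simplification, not the paper's mechanism: threads are assumed free to move to any lane, timing and barrier cost are ignored, and the names `utilization_baseline` and `utilization_compacted` are hypothetical.

```python
import math

def utilization_baseline(warps):
    """SIMD utilization when each divergent warp issues once per branch path.

    warps: equal-width lists of booleans (True = thread takes the branch).
    """
    width = len(warps[0])
    issued = sum(len(set(w)) for w in warps)  # 1 if uniform, 2 if divergent
    active = sum(len(w) for w in warps)       # every thread runs exactly one path
    return active / (issued * width)

def utilization_compacted(warps):
    """Utilization when divergent warps are compacted per path and uniform warps bypass."""
    width = len(warps[0])
    issued = 0
    taken = not_taken = 0
    for w in warps:
        if len(set(w)) == 1:
            issued += 1                       # uniform warp bypasses the compaction
        else:
            taken += sum(w)
            not_taken += len(w) - sum(w)
    issued += math.ceil(taken / width) + math.ceil(not_taken / width)
    active = sum(len(w) for w in warps)
    return active / (issued * width)
```

For three 4-lane warps of which two split evenly at the branch, the baseline issues 5 warp instructions for 12 threads (utilization 0.6), while compaction with bypass issues 3 (utilization 1.0).
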
Volumetric lattice Boltzmann method for pore-scale mass diffusion-advection process in geopolymer porous structures
13
Authors: Xiaoyu Zhang, Zirui Mao, Floyd W. Hilty, Yulan Li, Agnes Grandjean, Robert Montgomery, Hans-Conrad zur Loye, Huidan Yu, Shenyang Hu. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2024, No. 6, pp. 2126-2136 (11 pages)
Porous materials present significant advantages for absorbing radioactive isotopes in nuclear waste streams. To improve absorption efficiency in nuclear waste treatment, a thorough understanding of the diffusion-advection process within porous structures is essential for material design. In this study, we present advancements in the volumetric lattice Boltzmann method (VLBM) for modeling and simulating pore-scale diffusion-advection of radioactive isotopes within geopolymer porous structures. These structures are created using the phase field method (PFM) to precisely control pore architectures. In our VLBM approach, we introduce a concentration field of an isotope seamlessly coupled with the velocity field and solve it through the time evolution of its particle population function. To address the computational intensity inherent in the coupled lattice Boltzmann equations for the velocity and concentration fields, we implement graphics processing unit (GPU) parallelization. Validation of the developed model involves examining the flow and diffusion fields in porous structures. Good agreement is observed both between the velocity fields from VLBM and the multiphysics object-oriented simulation environment (MOOSE), and between the concentration fields from VLBM and the finite difference method (FDM). Furthermore, we investigate the effects of background flow, species diffusivity, and porosity on the diffusion-advection behavior by varying the background flow velocity, diffusion coefficient, and pore volume fraction, respectively. All three parameters influence the diffusion-advection process. Increased background flow and diffusivity markedly accelerate the process due to increased advection intensity and enhanced diffusion capability, respectively. Conversely, increasing the porosity has a less significant effect, causing a slight slowdown of the diffusion-advection process due to the expanded pore volume. This comprehensive parametric study provides valuable insights into the kinetics of isotope uptake in porous structures, facilitating the development of porous materials for nuclear waste treatment applications.
Keywords: volumetric lattice Boltzmann method (VLBM); phase field method (PFM); pore-scale diffusion-advection; nuclear waste treatment; porous media flow; graphics processing unit (GPU); parallelization
Real-time Virtual Environment Signal Extraction and Denoising Using Programmable Graphics Hardware
14
Authors: Yang Su, Zhi-Jie Xu, Xiang-Qian Jiang. International Journal of Automation and Computing (EI), 2009, No. 4, pp. 326-334 (9 pages)
The sense of being within a three-dimensional (3D) space and interacting with virtual 3D objects in a computer-generated virtual environment (VE) often requires essential image, vision, and sensor signal processing techniques such as differentiating and denoising. This paper describes novel implementations of Gaussian filtering for characteristic signal extraction and of wavelet-based image denoising algorithms that run on the graphics processing unit (GPU). While significant acceleration over standard CPU implementations is obtained by exploiting the data parallelism provided by modern programmable graphics hardware, the CPU is freed up to run other computations, such as artificial intelligence (AI) and physics, more efficiently. The proposed GPU-based Gaussian filtering can extract surface information from a real object and provide its material features for rendering and illumination. The wavelet-based signal denoising for large digital images realized in this project provides better realism for VE visualization without sacrificing the real-time and interactive performance of an application.
Keywords: virtual environment; graphics processing unit; GPU-based Gaussian filtering; signal denoising; wavelet
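
Editorial sketch: Gaussian filtering is data-parallel because each output sample is a weighted sum of a small neighborhood of inputs. The Python below shows the 1D case; the kernel radius, the clamped border handling, and the names `gaussian_kernel` and `smooth` are assumptions, not details taken from the paper.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth(signal, sigma=1.0, radius=3):
    """Convolve the signal with a Gaussian kernel, clamping indices at the borders."""
    k = gaussian_kernel(sigma, radius)
    n = len(signal)
    out = []
    for i in range(n):  # each iteration is independent: one GPU thread per output
        acc = 0.0
        for j, w in enumerate(k):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out
```

A 2D image filter can apply the same 1D pass along rows and then along columns, because the Gaussian kernel is separable.
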
Optimization of a precise integration method for seismic modeling based on graphic processing unit (cited by: 2)
15
Authors: Jingyu Li, Genyang Tang, Tianyue Hu. Earthquake Science (CSCD), 2010, No. 4, pp. 387-393 (7 pages)
General purpose graphic processing unit (GPU) computing is increasingly used in various fields. Its single instruction, multiple threads execution mode suits seismic numerical simulation, which involves a huge quantity of data and calculation steps. In this study, we introduce a GPU-based parallel implementation of a precise integration method (PIM) for seismic forward modeling. Compared with CPU single-core calculation, GPU parallel calculation fully preserves the features of PIM, namely small bandwidth, high accuracy, and the capability of modeling complex substructures, while bringing high computational efficiency, which means that high-performance GPU parallel calculation can bring seismic forward modeling closer to real seismic records.
Keywords: precise integration method; seismic modeling; general purpose GPU; graphic processing unit
The inversion of density structure by graphic processing unit (GPU) and identification of igneous rocks in Xisha area (cited by: 1)
16
Authors: Lei Yu, Jian Zhang, Wei Lin, Rongqiang Wei, Shiguo Wu. Earthquake Science, 2014, No. 1, pp. 117-125 (9 pages)
Organic reefs, the targets of deep-water petroleum exploration, developed widely in the Xisha area. However, there are concealed igneous rocks under the sea, to which organic rocks have nearly equal wave impedance. The igneous rocks therefore interfere with future exploration because they have similar seismic reflection characteristics. Yet the density and magnetism of organic reefs are very different from those of igneous rocks, so gravity and magnetic data have obvious advantages for distinguishing organic reefs from igneous rocks. First, frequency decomposition was applied to the free-air gravity anomaly in the Xisha area to obtain the 2D subdivision of the gravity anomaly and magnetic anomaly in the vertical direction. The distribution of igneous rocks in the horizontal direction can thus be acquired according to the high-frequency field, the low-frequency field, and their physical properties. Then, 3D forward modeling of the gravitational field was carried out to establish the density model of this area with reference to the physical properties of rocks reported in former research. Furthermore, 3D inversion of the gravity anomaly by a genetic algorithm with graphic processing unit (GPU) parallel processing was applied to the Xisha target area, and the 3D density structure of this area was obtained. In this way, the igneous rocks can be confined to a certain depth according to their density. The frequency decomposition and the GPU-parallel genetic-algorithm 3D inversion of the gravity anomaly proved to be a useful method for locating igneous rocks in their 3D geological position. Organic reefs and igneous rocks can thus be distinguished, which provides predictive information for further exploration.
Keywords: Xisha area; organic reefs and igneous rocks; frequency decomposition of potential field; 3D inversion of the graphic processing unit (GPU) parallel processing
Graphic Processing Unit-Accelerated Neural Network Model for Biological Species Recognition
17
Authors: 温程璐, 潘伟, 陈晓熹, 祝青园. Journal of Donghua University (English Edition) (EI, CAS), 2012, No. 1, pp. 5-8 (4 pages)
A graphic processing unit (GPU)-accelerated biological species recognition method using a partially connected neural evolutionary network model is introduced in this paper. The partially connected neural evolutionary network adopted in the paper overcomes the small-input limitation of traditional neural networks. The whole image is taken as the input of the neural network, so the maximum number of features is kept for recognition. To speed up the recognition process, a fast implementation of the partially connected neural network was built on an NVIDIA Tesla C1060 using the NVIDIA compute unified device architecture (CUDA) framework. Image sets of eight biological species were obtained to test the GPU implementation and its serial CPU counterpart. The experimental results showed that the GPU implementation works effectively in terms of both recognition rate and speed, gaining a speedup of 343 over the CPU implementation. Compared with a feature-based recognition method on the same task, the method also achieved an acceptable correct rate of 84.6% when tested on the eight biological species.
Keywords: graphic processing unit (GPU); compute unified device architecture (CUDA); neural network; species recognition
Batch image encryption algorithm optimized with chaos, a thread pool, and GPU
18
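Authors: 潘明华, 王一涵, 谷盛民, 孙绍华. 《科学技术与工程》 (Science Technology and Engineering) (PKU Core), 2023, No. 34, pp. 14618-14626 (9 pages)
Large data volume and high redundancy are salient characteristics of digital images, which makes fast, real-time encryption of large batches of images challenging. To address this problem, a batch image encryption algorithm based on Lorenz chaotic encryption and optimized with a combination of a thread pool and a graphics processing unit (GPU) is designed. The algorithm uses a thread pool to improve image reading and writing and applies a mirror transformation to the images. It generates the encryption sequence with the Lorenz chaotic system and encrypts the images using block-wise chaotic sequences. It then packs the batch image data and performs large-batch asynchronous computation on the GPU. Finally, it reassembles the image matrices to obtain the batch of encrypted images. Experimental tests show that the algorithm effectively resists common attacks and that the performance-optimized batch image encryption algorithm ensures image security while significantly improving the batch image reading rate and the encryption and decryption efficiency.
Keywords: image encryption; chaotic system; parallel computing; thread pool; graphics processing unit (GPU)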
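
Editorial sketch: the core idea of chaos-based encryption, deriving a keystream from a chaotic trajectory and combining it with the pixel bytes, can be shown in a few lines of Python. This is not the paper's algorithm (which also uses mirror transformation, block-wise sequences, a thread pool, and a GPU) and it is not a vetted cipher. The Euler integration, step size, burn-in length, byte quantization, and the names `lorenz_keystream` and `xor_bytes` are all assumptions.

```python
def lorenz_keystream(n, x=0.1, y=0.2, z=0.3, dt=0.005, burn_in=1000):
    """Produce n key bytes from a Lorenz trajectory; (x, y, z) act as the secret key.

    Assumptions: classic parameters (sigma=10, rho=28, beta=8/3), explicit Euler steps.
    """
    sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
    out = bytearray()
    for step in range(burn_in + n):
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        if step >= burn_in:  # discard the transient before emitting bytes
            out.append(int(abs(x) * 1e6) % 256)
    return bytes(out)

def xor_bytes(data, key_stream):
    """XOR data with the keystream; applying it twice restores the original."""
    return bytes(d ^ k for d, k in zip(data, key_stream))
```

Because the XOR of each byte is independent once the keystream exists, that stage is the part that lends itself to batching across many images on a GPU.
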
Real-time electroholography using a single spatial light modulator and a cluster of graphics-processing units connected by a gigabit Ethernet network (cited by: 3)
19
Authors: Hiromi Sannomiya, Naoki Takada, Tomoya Sakaguchi, Hirotaka Nakayama, Minoru Oikawa, Yuichiro Mori, Takashi Kakue, Tomoyoshi Shimobaba, Tomoyoshi Ito. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2020, No. 2, pp. 23-27 (5 pages)
Systems containing multiple graphics-processing-unit (GPU) clusters are difficult to use for real-time electroholography when using only a single spatial light modulator because the transfer of the computer-generated hologram data between the GPUs is bottlenecked. To overcome this bottleneck, we propose a rapid GPU packing scheme that significantly reduces the volume of the required data transfer. The proposed method uses a multi-GPU cluster system connected with a cost-effective gigabit Ethernet network. In tests, we achieved real-time electroholography of a three-dimensional (3D) video presenting a point-cloud 3D object made up of approximately 200,000 points.
Keywords: real-time electroholography; multiple-graphics processing unit cluster; graphics processing unit; gigabit Ethernet
Implementing Delay Multiply and Sum Beamformer on a Hybrid CPU-GPU Platform for Medical Ultrasound Imaging Using OpenMP and CUDA (cited by: 2)
20
Authors: Ke Song, Paul Liu, Dongquan Liu. Computer Modeling in Engineering & Sciences (SCIE, EI), 2021, No. 9, pp. 1133-1150 (18 pages)
A novel beamforming algorithm named Delay Multiply and Sum (DMAS), which excels at enhancing the resolution and contrast of ultrasonic images, has recently been proposed. However, the algorithm contains nested loops, so its computational complexity is higher than that of the Delay and Sum (DAS) beamformer, which is widely used in industry. We therefore propose a simple vector-based method to lower its complexity. The key point is to transform the nested loops into several vector operations, which can be efficiently implemented on many parallel platforms, such as graphics processing units (GPUs) and multi-core central processing units (CPUs). Consequently, we implement the algorithm on such a platform. To maximize the use of computing power, we use the GPUs and multi-core CPUs in combination. The platform used in our test is a low-cost personal computer (PC) with a GPU and a multi-core CPU installed. The results show that the hybrid use of a CPU and a GPU yields a significant performance improvement over using a GPU or a multi-core CPU alone. The performance of the hybrid system is about 47% to 63% higher than that of a single GPU. When 32 elements are used in receiving, the frame rate can basically reach 30 fps. In the best case, the frame rate increases to 40 fps.
Keywords: beamforming; delay multiply and sum; graphics processing unit; multi-core central processing unit
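
Editorial sketch: one well-known way to remove the nested loop in DMAS is an algebraic identity. With r_i = sign(s_i)·sqrt(|s_i|), the pairwise sum over i < j of sign(s_i s_j)·sqrt(|s_i s_j|) equals ((Σ r_i)² − Σ |s_i|) / 2, which needs only element-wise operations and reductions. Whether this is exactly the paper's "vector-based method" is an assumption; the function names below are hypothetical, and the inputs stand for already-delayed channel samples at one pixel.

```python
import math

def signed_sqrt(v):
    """sign(v) * sqrt(|v|)."""
    return math.copysign(math.sqrt(abs(v)), v)

def dmas_nested(s):
    """Reference DMAS output for one pixel using the O(N^2) pairwise loop."""
    total = 0.0
    n = len(s)
    for i in range(n - 1):
        for j in range(i + 1, n):
            total += signed_sqrt(s[i] * s[j])
    return total

def dmas_vectorized(s):
    """Same value in O(N): ((sum of r_i)^2 - sum of |s_i|) / 2, with r_i = signed_sqrt(s_i)."""
    r = [signed_sqrt(v) for v in s]
    sum_r = sum(r)
    sum_abs = sum(abs(v) for v in s)
    return 0.5 * (sum_r * sum_r - sum_abs)
```

The element-wise square root and the two sums map directly onto GPU kernels or OpenMP reductions, which fits the hybrid CPU-GPU split the abstract describes.
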