期刊文献+
共找到109篇文章
< 1 2 6 >
每页显示 20 50 100
Optimization of a precise integration method for seismic modeling based on graphic processing unit 被引量:2
1
作者 Jingyu Li Genyang Tang Tianyue Hu 《Earthquake Science》 CSCD 2010年第4期387-393,共7页
General purpose graphic processing unit (GPU) calculation technology is gradually widely used in various fields. Its mode of single instruction, multiple threads is capable of seismic numerical simulation which has ... General purpose graphic processing unit (GPU) calculation technology is gradually widely used in various fields. Its mode of single instruction, multiple threads is capable of seismic numerical simulation which has a huge quantity of data and calculation steps. In this study, we introduce a GPU-based parallel calculation method of a precise integration method (PIM) for seismic forward modeling. Compared with CPU single-core calculation, GPU parallel calculating perfectly keeps the features of PIM, which has small bandwidth, high accuracy and capability of modeling complex substructures, and GPU calculation brings high computational efficiency, which means that high-performing GPU parallel calculation can make seismic forward modeling closer to real seismic records. 展开更多
关键词 precise integration method seismic modeling general purpose GPU graphic processing unit
下载PDF
TIME-DOMAIN INTERPOLATION ON GRAPHICS PROCESSING UNIT 被引量:1
2
作者 XIQI LI GUOHUA SHI YUDONG ZHANG 《Journal of Innovative Optical Health Sciences》 SCIE EI CAS 2011年第1期89-95,共7页
The signal processing speed of spectral domain optical coherence tomography(SD-OCT)has become a bottleneck in a lot of medical applications.Recently,a time-domain interpolation method was proposed.This method can get ... The signal processing speed of spectral domain optical coherence tomography(SD-OCT)has become a bottleneck in a lot of medical applications.Recently,a time-domain interpolation method was proposed.This method can get better signal-to-noise ratio(SNR)but much-reduced signal processing time in SD-OCT data processing as compared with the commonly used zeropadding interpolation method.Additionally,the resampled data can be obtained by a few data and coefficients in the cutoff window.Thus,a lot of interpolations can be performed simultaneously.So,this interpolation method is suitable for parallel computing.By using graphics processing unit(GPU)and the compute unified device architecture(CUDA)program model,time-domain interpolation can be accelerated significantly.The computing capability can be achieved more than 250,000 A-lines,200,000 A-lines,and 160,000 A-lines in a second for 2,048 pixel OCT when the cutoff length is L=11,L=21,and L=31,respectively.A frame SD-OCT data(400A-lines×2,048 pixel per line)is acquired and processed on GPU in real time.The results show that signal processing time of SD-OCT can befinished in 6.223 ms when the cutoff length L=21,which is much faster than that on central processing unit(CPU).Real-time signal processing of acquired data can be realized. 展开更多
关键词 Optical coherence tomography real-time signal processing graphics processing unit GPU CUDA
下载PDF
The inversion of density structure by graphic processing unit(GPU) and identification of igneous rocks in Xisha area 被引量:1
3
作者 Lei Yu Jian Zhang +2 位作者 Wei Lin Rongqiang Wei Shiguo Wu 《Earthquake Science》 2014年第1期117-125,共9页
Organic reefs, the targets of deep-water petro- leum exploration, developed widely in Xisha area. However, there are concealed igneous rocks undersea, to which organic rocks have nearly equal wave impedance. So the ig... Organic reefs, the targets of deep-water petro- leum exploration, developed widely in Xisha area. However, there are concealed igneous rocks undersea, to which organic rocks have nearly equal wave impedance. So the igneous rocks have become interference for future explo- ration by having similar seismic reflection characteristics. Yet, the density and magnetism of organic reefs are very different from igneous rocks. It has obvious advantages to identify organic reefs and igneous rocks by gravity and magnetic data. At first, frequency decomposition was applied to the free-air gravity anomaly in Xisha area to obtain the 2D subdivision of the gravity anomaly and magnetic anomaly in the vertical direction. Thus, the dis- tribution of igneous rocks in the horizontal direction can be acquired according to high-frequency field, low-frequency field, and its physical properties. Then, 3D forward model- ing of gravitational field was carried out to establish the density model of this area by reference to physical properties of rocks based on former researches. Furthermore, 3D inversion of gravity anomaly by genetic algorithm method of the graphic processing unit (GPU) parallel processing in Xisha target area was applied, and 3D density structure of this area was obtained. By this way, we can confine the igneous rocks to the certain depth according to the density of the igneous rocks. The frequency decomposition and 3D inversion of gravity anomaly by genetic algorithm method of the GPU parallel processing proved to be a useful method for recognizing igneous rocks to its 3D geological position. So organic reefs and igneous rocks can be identified, which provide a prescient information for further exploration. 展开更多
关键词 Xisha area Organic reefs and igneous rocks -Frequency decomposition of potential field 3D inversionof the graphic processing unit (GPU) parallel processing
下载PDF
Simulation of fluid-structure interaction in a microchannel using the lattice Boltzmann method and size-dependent beam element on a graphics processing unit
4
作者 Vahid Esfahanian Esmaeil Dehdashti Amir Mehdi Dehrouye-Semnani 《Chinese Physics B》 SCIE EI CAS CSCD 2014年第8期389-395,共7页
Fluid-structure interaction (FSI) problems in microchannels play a prominent role in many engineering applications. The present study is an effort toward the simulation of flow in microchannel considering FSI. The b... Fluid-structure interaction (FSI) problems in microchannels play a prominent role in many engineering applications. The present study is an effort toward the simulation of flow in microchannel considering FSI. The bottom boundary of the microchannel is simulated by size-dependent beam elements for the finite element method (FEM) based on a modified cou- ple stress theory. The lattice Boltzmann method (LBM) using the D2Q13 LB model is coupled to the FEM in order to solve the fluid part of the FSI problem. Because of the fact that the LBM generally needs only nearest neighbor information, the algorithm is an ideal candidate for parallel computing. The simulations are carried out on graphics processing units (GPUs) using computed unified device architecture (CUDA). In the present study, the governing equations are non-dimensionalized and the set of dimensionless groups is exhibited to show their effects on micro-beam displacement. The numerical results show that the displacements of the micro-beam predicted by the size-dependent beam element are smaller than those by the classical beam element. 展开更多
关键词 fluid-structure interaction graphics processing unit lattice Boltzmann method size-dependentbeam element
下载PDF
Reconfigurable-System-on-Chip Implementation of Data Processing Units for Space Applications
5
作者 张宇宁 常亮 +1 位作者 杨根庆 李华旺 《Transactions of Tianjin University》 EI CAS 2010年第4期270-274,共5页
Application-specific data processing units (DPUs) are commonly adopted for operational control and data processing in space missions. To overcome the limitations of traditional radiation-hardened or fully commercial d... Application-specific data processing units (DPUs) are commonly adopted for operational control and data processing in space missions. To overcome the limitations of traditional radiation-hardened or fully commercial design approaches, a reconfigurable-system-on-chip (RSoC) solution based on state-of-the-art FPGA is introduced. The flexibility and reliability of this approach are outlined, and the requirements for an enhanced RSoC design with in-flight reconfigurability for space applications are presented. This design has been demonstrated as an on-board computer prototype, providing an in-flight reconfigurable DPU design approach using integrated hardwired processors. 展开更多
关键词 RSoC in-flight reconfigurability spaceborne data processing unit
下载PDF
Graphic Processing Unit-Accelerated Mutual Information-Based 3D Image Rigid Registration
6
作者 李冠华 欧宗瑛 +1 位作者 苏铁明 韩军 《Transactions of Tianjin University》 EI CAS 2009年第5期375-380,共6页
Mutual information (MI)-based image registration is effective in registering medical images, but it is computationally expensive. This paper accelerates MI-based image registration by dividing computation of mutual ... Mutual information (MI)-based image registration is effective in registering medical images, but it is computationally expensive. This paper accelerates MI-based image registration by dividing computation of mutual information into spatial transformation and histogram-based calculation, and performing 3D spatial transformation and trilinear interpolation on graphic processing unit (GPU). The 3D floating image is downloaded to GPU as flat 3D texture, and then fetched and interpolated for each new voxel location in fragment shader. The transformed resuits are rendered to textures by using frame buffer object (FBO) extension, and then read to the main memory used for the remaining computation on CPU. Experimental results show that GPU-accelerated method can achieve speedup about an order of magnitude with better registration result compared with the software implementation on a single-core CPU. 展开更多
关键词 image registration mutual information graphic processing unit (GPU)
下载PDF
Acceleration of optical coherence tomography signal processing by multi-graphics processing units
7
作者 Xiqi Li Guohua Shi +2 位作者 Ping Huang Yudong Zhang 《Journal of Innovative Optical Health Sciences》 SCIE EI CAS 2014年第3期67-72,共6页
A multi-GPU system designed for high-speed,real-time signal processing of optical coherencetomography(OCT)is described herein.For the OCT data sampled in linear wave numbers,themaximum procesing rates reached 2.95 MHz... A multi-GPU system designed for high-speed,real-time signal processing of optical coherencetomography(OCT)is described herein.For the OCT data sampled in linear wave numbers,themaximum procesing rates reached 2.95 MHz for 1024-OCT and 1.96 MHz for 2048-OCT.Data sampled using linear wavelengths were re-sampled using a time-domain interpolation method and zero-padding interpolation method to improve image quality.The maximum processing rates for1024-OCT reached 2.16 MHz for the time-domain method and 1.26 MHz for the zero-paddingmethod.The maximum processing rates for 2048-0CT reached_1.58 MHz,and 0.68 MHz,respectively.This method is capable of high-speed,real-time processing for O CT systems. 展开更多
关键词 Optical coherence tomography real time signal processing multigraphics processing units.
下载PDF
Compute Unified Device Architecture Implementation of Euler/Navier-Stokes Solver on Graphics Processing Unit Desktop Platform for 2-D Compressible Flows
8
作者 Zhang Jiale Chen Hongquan 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI CSCD 2016年第5期536-545,共10页
Personal desktop platform with teraflops peak performance of thousands of cores is realized at the price of conventional workstations using the programmable graphics processing units(GPUs).A GPU-based parallel Euler/N... Personal desktop platform with teraflops peak performance of thousands of cores is realized at the price of conventional workstations using the programmable graphics processing units(GPUs).A GPU-based parallel Euler/Navier-Stokes solver is developed for 2-D compressible flows by using NVIDIA′s Compute Unified Device Architecture(CUDA)programming model in CUDA Fortran programming language.The techniques of implementation of CUDA kernels,double-layered thread hierarchy and variety memory hierarchy are presented to form the GPU-based algorithm of Euler/Navier-Stokes equations.The resulting parallel solver is validated by a set of typical test flow cases.The numerical results show that dozens of times speedup relative to a serial CPU implementation can be achieved using a single GPU desktop platform,which demonstrates that a GPU desktop can serve as a costeffective parallel computing platform to accelerate computational fluid dynamics(CFD)simulations substantially. 展开更多
关键词 graphics processing unit(GPU) GPU parallel computing compute unified device architecture(CUDA)Fortran finite volume method(FVM) acceleration
下载PDF
Graphic Processing Unit-Accelerated Neural Network Model for Biological Species Recognition
9
作者 温程璐 潘伟 +1 位作者 陈晓熹 祝青园 《Journal of Donghua University(English Edition)》 EI CAS 2012年第1期5-8,共4页
A graphic processing unit (GPU)-accelerated biological species recognition method using partially connected neural evolutionary network model is introduced in this paper. The partial connected neural evolutionary netw... A graphic processing unit (GPU)-accelerated biological species recognition method using partially connected neural evolutionary network model is introduced in this paper. The partial connected neural evolutionary network adopted in the paper can overcome the disadvantage of traditional neural network with small inputs. The whole image is considered as the input of the neural network, so the maximal features can be kept for recognition. To speed up the recognition process of the neural network, a fast implementation of the partially connected neural network was conducted on NVIDIA Tesla C1060 using the NVIDIA compute unified device architecture (CUDA) framework. Image sets of eight biological species were obtained to test the GPU implementation and counterpart serial CPU implementation, and experiment results showed GPU implementation works effectively on both recognition rate and speed, and gained 343 speedup over its counterpart CPU implementation. Comparing to feature-based recognition method on the same recognition task, the method also achieved an acceptable correct rate of 84.6% when testing on eight biological species. 展开更多
关键词 graphic processing unit(GPU) compute unified device architecture (CUDA) neural network species recognition
下载PDF
Multi-relaxation-time lattice Boltzmann simulations of lid driven flows using graphics processing unit
10
作者 Chenggong LI J.P.Y.MAA 《Applied Mathematics and Mechanics(English Edition)》 SCIE EI CSCD 2017年第5期707-722,共16页
Large eddy simulation (LES) using the Smagorinsky eddy viscosity model is added to the two-dimensional nine velocity components (D2Q9) lattice Boltzmann equation (LBE) with multi-relaxation-time (MRT) to simul... Large eddy simulation (LES) using the Smagorinsky eddy viscosity model is added to the two-dimensional nine velocity components (D2Q9) lattice Boltzmann equation (LBE) with multi-relaxation-time (MRT) to simulate incompressible turbulent cavity flows with the Reynolds numbers up to 1 × 10^7. To improve the computation efficiency of LBM on the numerical simulations of turbulent flows, the massively parallel computing power from a graphic processing unit (GPU) with a computing unified device architecture (CUDA) is introduced into the MRT-LBE-LES model. The model performs well, compared with the results from others, with an increase of 76 times in computation efficiency. It appears that the higher the Reynolds numbers is, the smaller the Smagorinsky constant should be, if the lattice number is fixed. Also, for a selected high Reynolds number and a selected proper Smagorinsky constant, there is a minimum requirement for the lattice number so that the Smagorinsky eddy viscosity will not be excessively large. 展开更多
关键词 large eddy simulation (LES) multi-relaxation-time (MRT) lattice Boltzmann equation (LBE) two-dimensional nine velocity components (D2Q9) Smagorinskymodel graphic processing unit (GPU) computing unified device architecture (CUDA)
下载PDF
Parallel Image Processing: Taking Grayscale Conversion Using OpenMP as an Example
11
作者 Bayan AlHumaidan Shahad Alghofaily +2 位作者 Maitha Al Qhahtani Sara Oudah Naya Nagy 《Journal of Computer and Communications》 2024年第2期1-10,共10页
In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularl... In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, ushered in a new era of efficiency and speed. This trend was particularly noteworthy in the field of image processing, which witnessed significant advancements. This parallel computing project explored the field of parallel image processing, with a focus on the grayscale conversion of colorful images. Our approach involved integrating OpenMP into our framework for parallelization to execute a critical image processing task: grayscale conversion. By using OpenMP, we strategically enhanced the overall performance of the conversion process by distributing the workload across multiple threads. The primary objectives of our project revolved around optimizing computation time and improving overall efficiency, particularly in the task of grayscale conversion of colorful images. Utilizing OpenMP for concurrent processing across multiple cores significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallelization efficiency with an increasing number of cores. This underscored the importance of a carefully optimized parallelization strategy, considering factors like load balancing and minimizing communication overhead. Despite challenges, the overall scalability and efficiency achieved with parallel image processing underscored OpenMP’s effectiveness in accelerating image manipulation tasks. 展开更多
关键词 Parallel Computing Image processing OPENMP Parallel Programming High Performance Computing GPU (Graphic processing unit)
下载PDF
Pretreatment of Crude Oil by Ultrasonic-electric United Desalting and Dewatering 被引量:13
12
作者 叶国祥 吕效平 +1 位作者 彭飞 韩萍芳 《Chinese Journal of Chemical Engineering》 SCIE EI CAS CSCD 2008年第4期564-569,共6页
A technology of ultrasonic-electric united desalting and dewatering of crude oil is studied. The ultrasonic setup is designed to form a standing-wave field, which is more efficient for agglomeration of water particles... A technology of ultrasonic-electric united desalting and dewatering of crude oil is studied. The ultrasonic setup is designed to form a standing-wave field, which is more efficient for agglomeration of water particles. The desalting and dewatering results of the ultrasonic-electric united process are compared with those of the electric process. For high salt-contenting crude oil (40-70 mg·L ^-1), the salt content is still above 10.0 mg·L^-1 after crude oil has been treated by two-stage electric desalting process in refinery, which cannot meet the need of refinery. Ultrasonic-electric united process is a novel technology for treating the high salt-contenting oil. On the optimal operating conditions of the ultrasonic-electric united process, the salt content of crude oil can be reduced from 67 5 mg·L^-1 to 3.97 mg·L ^-1 by one-stage ultrasonic-electric united process, and the water content falls below 0.3% (by volume). The results show that the ultrasonic-electric united process is more effective than the electric process in high salt-contenting oil desalting. This technology should be useful in the refinery process. 展开更多
关键词 crude oil ultrasonic-electric united process DESALTING DEWATERING
下载PDF
Design of 8.0Mt/a Atmospheric and Vacuum Distillation Unit
13
作者 Li Hejie(Luoyang Petrochemical Engineering Company ,Luoyang 471003) 《China Petroleum Processing & Petrochemical Technology》 SCIE CAS 2003年第1期35-41,共7页
The design features of 8 Mt/a atmospheric and vacuum distillation unit (Ⅲ) in Zhenhai Refiningand Chemical Company are presented and various process schemes are compared. Production practice hasproved that the main p... The design features of 8 Mt/a atmospheric and vacuum distillation unit (Ⅲ) in Zhenhai Refiningand Chemical Company are presented and various process schemes are compared. Production practice hasproved that the main process design is advanced and reasonable and the process parameters basicallyreached design requirements. 展开更多
关键词 distillation unit atmospheric and vacuum distillation unit DESIGN revamping large-scale process unit imported crude
下载PDF
Joint device architecture algorithm codesign of the photonic neural processing unit
14
作者 Li Pei Zeya Xi +4 位作者 Bing Bai Jianshuai Wang Jingjing Zheng Jing Li Tigang Ning 《Advanced Photonics Nexus》 2023年第3期132-138,共7页
The photonic neural processing unit(PNPU)demonstrates ultrahigh inference speed with low energy consumption,and it has become a promising hardware artificial intelligence(AI)accelerator.However,the nonidealities of th... The photonic neural processing unit(PNPU)demonstrates ultrahigh inference speed with low energy consumption,and it has become a promising hardware artificial intelligence(AI)accelerator.However,the nonidealities of the photonic device and the peripheral circuit make the practical application much more complex.Rather than optimizing the photonic device,the architecture,and the algorithm individually,a joint device-architecture-algorithm codesign method is proposed to improve the accuracy,efficiency and robustness of the PNPU.First,a full-flow simulator for the PNPU is developed from the back end simulator to the high-level training framework;Second,the full system architecture and the complete photonic chip design enable the simulator to closely model the real system;Third,the nonidealities of the photonic chip are evaluated for the PNPU design.The average test accuracy exceeds 98%,and the computing power exceeds 100TOPS. 展开更多
关键词 optics photonics Mach-Zehnder interferometer array photonic neural processing unit design
下载PDF
System Safety Management of Whole Life Cycle of Chemical Process Unit
15
作者 SANG Shansong LIU Xinshang 《International Journal of Plant Engineering and Management》 2020年第1期14-23,共10页
Guaranteeing the safety performance of chemical process units is the premise for the safety production of chemical enterprises.Only to have the system safety management of the whole life cycle of the process units can... Guaranteeing the safety performance of chemical process units is the premise for the safety production of chemical enterprises.Only to have the system safety management of the whole life cycle of the process units can operate the process systems under the state of controllable risk. 展开更多
关键词 whole life cycle of process unit SAFETY MANAGEMENT
下载PDF
Volumetric lattice Boltzmann method for pore-scale mass diffusionadvection process in geopolymer porous structures 被引量:1
16
作者 Xiaoyu Zhang Zirui Mao +6 位作者 Floyd W.Hilty Yulan Li Agnes Grandjean Robert Montgomery Hans-Conrad zur Loye Huidan Yu Shenyang Hu 《Journal of Rock Mechanics and Geotechnical Engineering》 SCIE CSCD 2024年第6期2126-2136,共11页
Porous materials present significant advantages for absorbing radioactive isotopes in nuclear waste streams.To improve absorption efficiency in nuclear waste treatment,a thorough understanding of the diffusion-advecti... Porous materials present significant advantages for absorbing radioactive isotopes in nuclear waste streams.To improve absorption efficiency in nuclear waste treatment,a thorough understanding of the diffusion-advection process within porous structures is essential for material design.In this study,we present advancements in the volumetric lattice Boltzmann method(VLBM)for modeling and simulating pore-scale diffusion-advection of radioactive isotopes within geopolymer porous structures.These structures are created using the phase field method(PFM)to precisely control pore architectures.In our VLBM approach,we introduce a concentration field of an isotope seamlessly coupled with the velocity field and solve it by the time evolution of its particle population function.To address the computational intensity inherent in the coupled lattice Boltzmann equations for velocity and concentration fields,we implement graphics processing unit(GPU)parallelization.Validation of the developed model involves examining the flow and diffusion fields in porous structures.Remarkably,good agreement is observed for both the velocity field from VLBM and multiphysics object-oriented simulation environment(MOOSE),and the concentration field from VLBM and the finite difference method(FDM).Furthermore,we investigate the effects of background flow,species diffusivity,and porosity on the diffusion-advection behavior by varying the background flow velocity,diffusion coefficient,and pore volume fraction,respectively.Notably,all three parameters exert an influence on the diffusion-advection process.Increased background flow and diffusivity markedly accelerate the process due to increased advection intensity and enhanced diffusion capability,respectively.Conversely,increasing the porosity has a less significant effect,causing a slight slowdown of the diffusion-advection process due to the expanded pore volume.This comprehensive parametric study provides valuable insights into the kinetics of isotope uptake in porous structures,facilitating the development of porous materials for nuclear waste treatment applications. 展开更多
关键词 Volumetric lattice Boltzmann method(VLBM) Phase field method(PFM) Pore-scale diffusion-advection Nuclear waste treatment Porous media flow Graphics processing unit(GPU) parallelization
下载PDF
MicroMagnetic.jl:A Julia package for micromagnetic and atomistic simulations with GPU support
17
作者 Weiwei Wang Boyao Lyu +2 位作者 Lingyao Kong Hans Fangohr Haifeng Du 《Chinese Physics B》 SCIE EI CAS CSCD 2024年第10期70-79,共10页
MicroMagnetic.jl is an open-source Julia package for micromagnetic and atomistic simulations.Using the features of the Julia programming language,MicroMagnetic.jl supports CPU and various GPU platforms,including NVIDI... MicroMagnetic.jl is an open-source Julia package for micromagnetic and atomistic simulations.Using the features of the Julia programming language,MicroMagnetic.jl supports CPU and various GPU platforms,including NVIDIA,AMD,Intel,and Apple GPUs.Moreover,MicroMagnetic.jl supports Monte Carlo simulations for atomistic models and implements the nudged-elastic-band method for energy barrier computations.With built-in support for double and single precision modes and a design allowing easy extensibility to add new features,MicroMagnetic.jl provides a versatile toolset for researchers in micromagnetics and atomistic simulations. 展开更多
关键词 micromagnetic simulations atomistic simulations graphics processing units
下载PDF
Electromagnetic scattering and imaging simulation of extremely large-scale sea-ship scene based on GPU parallel technology
18
作者 Cheng-Wei Zhang Zhi-Qin Zhao +2 位作者 Wei Yang Li-Lai Zhou Hai-Yu Zhu 《Journal of Electronic Science and Technology》 EI CAS CSCD 2024年第2期16-23,共8页
Aiming to solve the bottleneck problem of electromagnetic scattering simulation in the scenes of extremely large-scale seas and ships,a high-frequency method by using graphics processing unit(GPU)parallel acceleration... Aiming to solve the bottleneck problem of electromagnetic scattering simulation in the scenes of extremely large-scale seas and ships,a high-frequency method by using graphics processing unit(GPU)parallel acceleration technique is proposed.For the implementation of different electromagnetic methods of physical optics(PO),shooting and bouncing ray(SBR),and physical theory of diffraction(PTD),a parallel computing scheme based on the CPU-GPU parallel computing scheme is realized to balance computing tasks.Finally,a multi-GPU framework is further proposed to solve the computational difficulty caused by the massive number of ray tubes in the ray tracing process.By using the established simulation platform,signals of ships at different seas are simulated and their images are achieved as well.It is shown that the higher sea states degrade the averaged peak signal-to-noise ratio(PSNR)of radar image. 展开更多
关键词 Multi graphics processing unit Radar imaging Sea-ship Shooting and bouncing rays
下载PDF
SOLVERS FOR SYSTEMS OF LARGE SPARSE LINEAR AND NONLINEAR EQUATIONS BASED ON MULTI-GPUS 被引量:3
19
作者 刘沙 钟诚文 陈效鹏 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI 2011年第3期300-308,共9页
Numerical treatment of engineering application problems often eventually results in a solution of systems of linear or nonlinear equations.The solution process using digital computational devices usually takes tremend... Numerical treatment of engineering application problems often eventually results in a solution of systems of linear or nonlinear equations.The solution process using digital computational devices usually takes tremendous time due to the extremely large size encountered in most real-world engineering applications.So,practical solvers for systems of linear and nonlinear equations based on multi graphic process units(GPUs)are proposed in order to accelerate the solving process.In the linear and nonlinear solvers,the preconditioned bi-conjugate gradient stable(PBi-CGstab)method and the Inexact Newton method are used to achieve the fast and stable convergence behavior.Multi-GPUs are utilized to obtain more data storage that large size problems need. 展开更多
关键词 general purpose graphic process unit(GPGPU) compute unified device architecture(CUDA) system of linear equations system of nonlinear equations Inexact Newton method bi-conjugate gradient stable(Bi-CGstab)method
下载PDF
Accelerating f inite difference wavef ield-continuation depth migration by GPU
20
作者 刘国峰 孟小红 刘洪 《Applied Geophysics》 SCIE CSCD 2012年第1期41-48,115,共9页
The most popular hardware used for parallel depth migration is the PC-Cluster but its application is limited due to large space occupation and high power consumption. In this paper, we introduce a new hardware archite... The most popular hardware used for parallel depth migration is the PC-Cluster but its application is limited due to large space occupation and high power consumption. In this paper, we introduce a new hardware architecture, based on which the finite difference (FD) wavefield-continuation depth migration can be conducted using the Graphics Processing Unit (GPU) as a CPU coprocessor. We demonstrate the program module and three key optimization steps for implementing FD depth migration: memory, thread structure, and instruction optimizations and consider evaluation methods for the amount of optimization. 2D and 3D models are used to test depth migration on the GPU. The tested results show that the depth migration computational efficiency greatly increased using the general-purpose GPU, increasing by at least 25 times compared to the AMD 2.5 GHz CPU. 展开更多
关键词 Wavefield-continuation depth migration finite difference Graphic processing unit EFFICIENCY
下载PDF
上一页 1 2 6 下一页 到第
使用帮助 返回顶部