Fast division-free parallel structure for convolution perfectly matched layer in finite difference time domain method


Abstract: Parallel acceleration of the convolution perfectly matched layer (CPML) algorithm suffers from a large number of division operations, and division is widely regarded as one of the most expensive operations on hardware such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs). In pursuit of higher efficiency and lower power consumption, this article revisits the CPML theory and proposes a new fast division-free parallel CPML structure. By rearranging the CPML inner iteration process, all division operators are eliminated and replaced by field-updating coefficients that are recalculated offline. Experiments show that the proposed division-free structure saves more than 50% of the arithmetic instructions and 25% of the execution time of the traditional parallel CPML structure without any accuracy loss.
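The idea of moving divisions out of the inner iteration and into offline coefficients can be illustrated with a minimal sketch. This is not the authors' structure: it assumes a simplified 1D CPML-style electric-field update in normalized units, and the profile values and every name (`cpml_coefficients`, `precompute`, `inv_kdx`, `c_dx`, `cb`) are hypothetical choices made only for this illustration.

```python
import math

def cpml_coefficients(n, dt=0.5):
    """Hypothetical graded CPML profile in normalized units (illustrative only)."""
    kappa, b, c = [], [], []
    for i in range(n):
        depth = (i + 0.5) / n              # 0 near the interface, 1 near the outer wall
        sigma = 2.0 * depth ** 3           # assumed conductivity grading
        k = 1.0 + 4.0 * depth ** 3         # assumed kappa grading
        alpha = 0.1 * (1.0 - depth)        # assumed CFS alpha grading
        bi = math.exp(-(sigma / k + alpha) * dt)
        kappa.append(k)
        b.append(bi)
        c.append(sigma * (bi - 1.0) / (k * (sigma + k * alpha)))
    return kappa, b, c

def update_with_division(e, h, psi, kappa, b, c, cb, dx):
    """Reference update: two divisions per cell in every time step."""
    for i in range(1, len(e)):
        dh = (h[i] - h[i - 1]) / dx
        psi[i] = b[i] * psi[i] + c[i] * dh
        e[i] += cb * (dh / kappa[i] + psi[i])

def precompute(kappa, c, dx):
    """Offline step: fold 1/dx and 1/kappa into the stored coefficients."""
    inv_kdx = [1.0 / (k * dx) for k in kappa]
    c_dx = [ci / dx for ci in c]
    return inv_kdx, c_dx

def update_division_free(e, h, psi, inv_kdx, b, c_dx, cb):
    """Same update using only multiplications and additions in the loop."""
    for i in range(1, len(e)):
        diff = h[i] - h[i - 1]
        psi[i] = b[i] * psi[i] + c_dx[i] * diff
        e[i] += cb * (inv_kdx[i] * diff + psi[i])

def run_demo(n=16, steps=20):
    """Run both variants on the same fixed H field; return the largest E mismatch."""
    dx, cb = 0.5, 0.4
    kappa, b, c = cpml_coefficients(n)
    inv_kdx, c_dx = precompute(kappa, c, dx)
    h = [math.sin(0.3 * i) for i in range(n)]
    e1, psi1 = [0.0] * n, [0.0] * n
    e2, psi2 = [0.0] * n, [0.0] * n
    for _ in range(steps):
        update_with_division(e1, h, psi1, kappa, b, c, cb, dx)
        update_division_free(e2, h, psi2, inv_kdx, b, c_dx, cb)
    return max(abs(x - y) for x, y in zip(e1, e2))

if __name__ == "__main__":
    print("max |E_div - E_divfree| =", run_demo())
```

In this sketch the two variants agree to within floating-point rounding, while the division-free loop performs no divisions at run time. The instruction and time savings reported in the abstract refer to the authors' GPU implementation, not to this toy example.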

Authors: Bai Bing, Niu Zhongqi, Niu Yi, Wei Bing, Zhao Gang

Affiliations: School of Electronic Engineering; School of Physics and Optoelectronic Engineering

Source: The Journal of China Universities of Posts and Telecommunications (中国邮电高校学报（英文版）), indexed in EI and CSCD, 2015, No. 1, pp. 72-76, 82 (6 pages in total)

Funding: Sponsored by the National Natural Science Foundation of China (30870577)

Keywords: division elimination, convolution perfectly matched layer, finite difference time domain, parallel computing, graphics processing unit

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

