基于CUDA的并行粒子群优化算法研究及实现 (cited by: 6)

Research and Design of Parallel Particle Swarm Optimization Algorithm Based on CUDA

Abstract: When a graphics processing unit (GPU) is used to accelerate the particle swarm optimization (PSO) algorithm through parallel computation, published work often highlights the GPU speedup at the cost of degrading the performance of the CPU-side PSO algorithm. To compare GPU-PSO and CPU-PSO on a sound basis, this paper proposes "effective speedup" as the performance metric. The evaluation method does not require the CPU and GPU sides to use the same number of particles: the GPU parallel algorithm is compared with the best CPU serial algorithm, and the criterion is how quickly each converges to a target precision. Numerical simulation experiments were carried out on several benchmark test functions under the Compute Unified Device Architecture (CUDA). The results show that substantially increasing the number of particles on the GPU accelerates convergence of the PSO algorithm to the target precision, giving an "effective speedup" of more than 10 over CPU-PSO.
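The abstract does not give a formula for "effective speedup", so the following is only a minimal sketch under one plausible reading: the time the best CPU serial configuration needs to reach the target precision, divided by the time the best GPU configuration needs to reach the same precision, with particle counts allowed to differ between the two sides. The function name `effective_speedup`, the dictionary layout, and all timings are hypothetical and are not taken from the paper.

```python
from typing import Mapping


def effective_speedup(cpu_times: Mapping[int, float],
                      gpu_times: Mapping[int, float]) -> float:
    """Best CPU time-to-target divided by best GPU time-to-target.

    Keys are particle counts; values are wall-clock seconds needed to
    reach the same target precision. The two sides do not need to share
    any particle count (assumed reading of the abstract).
    """
    if not cpu_times or not gpu_times:
        raise ValueError("need at least one timing on each side")
    best_cpu = min(cpu_times.values())  # fastest serial CPU configuration
    best_gpu = min(gpu_times.values())  # fastest parallel GPU configuration
    if best_gpu <= 0:
        raise ValueError("timings must be positive")
    return best_cpu / best_gpu


# Hypothetical timings in seconds, for illustration only.
cpu = {20: 4.0, 50: 3.0, 200: 6.0}   # small swarms on the CPU
gpu = {1000: 0.5, 5000: 0.25}        # much larger swarms on the GPU
print(effective_speedup(cpu, gpu))   # 3.0 / 0.25 = 12.0
```

Under this reading, the CPU baseline is the swarm size that converges fastest on the CPU, not the swarm size used on the GPU. That keeps the comparison from penalising the CPU with an oversized swarm it would not use in practice.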

Authors: Chen Feng (陈风), Tian Yubo (田雨波), Yang Min (杨敏)

Affiliation: School of Electronics and Information, Jiangsu University of Science and Technology (江苏科技大学电子信息学院)

Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2014, No. 9, pp. 263-268 (6 pages)

Funding: Supported by the Shipbuilding Industry National Defense Science and Technology Pre-research Fund (10J3.5.2)

Keywords: particle swarm optimization (PSO); parallel computing; graphics processing unit (GPU); compute unified device architecture (CUDA)

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]

