List Intersection Algorithm on Multi-core CPU/GPU Platform
(Original title: 多核CPU/GPU平台下的集合求交算法)


Abstract: A list intersection algorithm for a hybrid multi-core CPU/GPU platform is proposed. On the CPU side, an inward intersection algorithm and an improved Baeza-Yates algorithm are presented, exploiting the spatial locality of the data and an in-order intersection strategy; they yield speed gains of 0.79 and 1.25 times, respectively. On the GPU side, the idea of an effective search interval is introduced: each GPU Block computes its effective search interval on the remaining lists, which narrows the search range and improves intersection speed by 40% on average. On the hybrid platform, a time hiding technique overlaps data preprocessing and input/output operations with the GPU computation, and the results show that the overall system is about 85% faster on average.
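The two ideas named in the abstract can be illustrated with a small CPU-only sketch. The abstract does not give the authors' code, so everything below is an assumption for illustration: `baeza_yates` is the classic Baeza-Yates divide-and-conquer intersection (not the paper's improved variant), and `block_search_interval` / `blocked_intersect` are hypothetical names showing one plausible way a block of elements could restrict its search to an effective interval of another sorted list. Inputs are assumed to be sorted lists without duplicates.

```python
from bisect import bisect_left, bisect_right


def baeza_yates(a, b):
    """Intersect two sorted, duplicate-free lists (classic Baeza-Yates scheme)."""
    out = []
    _intersect(a, 0, len(a), b, 0, len(b), out)
    return out


def _intersect(a, alo, ahi, b, blo, bhi, out):
    # An empty range on either side cannot contribute matches.
    if alo >= ahi or blo >= bhi:
        return
    # Always take the median from the shorter range.
    if ahi - alo > bhi - blo:
        a, alo, ahi, b, blo, bhi = b, blo, bhi, a, alo, ahi
    mid = (alo + ahi) // 2
    pivot = a[mid]
    # Binary-search the pivot in the longer range.
    pos = bisect_left(b, pivot, blo, bhi)
    found = pos < bhi and b[pos] == pivot
    # Recurse left, emit the pivot, recurse right: the output stays sorted,
    # so no final sort is needed (an in-order traversal of the recursion).
    _intersect(a, alo, mid, b, blo, pos, out)
    if found:
        out.append(pivot)
    _intersect(a, mid + 1, ahi, b, pos + 1 if found else pos, bhi, out)


def block_search_interval(block, other):
    """Return (lo, hi) so that only other[lo:hi] can match elements of block."""
    if not block:
        return 0, 0
    lo = bisect_left(other, block[0])        # first element >= block minimum
    hi = bisect_right(other, block[-1], lo)  # one past last element <= block maximum
    return lo, hi


def blocked_intersect(shortest, other, block_size=4):
    """Split `shortest` into blocks; each block searches only its own interval.

    On a GPU each block would run in parallel; here the blocks run in a loop.
    """
    out = []
    for start in range(0, len(shortest), block_size):
        block = shortest[start:start + block_size]
        lo, hi = block_search_interval(block, other)
        for x in block:
            pos = bisect_left(other, x, lo, hi)
            if pos < hi and other[pos] == x:
                out.append(x)
    return out


if __name__ == "__main__":
    a = [1, 3, 4, 7, 9, 12, 15]
    b = [2, 3, 7, 8, 12, 20, 31, 40]
    print(baeza_yates(a, b))
    print(blocked_intersect(a, b, block_size=3))
```

Both functions return the common elements in ascending order. The interval computation costs two binary searches per block, which pays off when the other list is much longer than a block's value range.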

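Authors: Wang Huaichao, Zhao Lei

Affiliation: School of Computer Science and Technology, Soochow University

Source: Computer Engineering (《计算机工程》), indexed in CAS and CSCD, 2013, No. 4, pp. 296-299, 304 (5 pages)

Funding: National Natural Science Foundation of China (61073061)

Keywords: list intersection; multi-core CPU; GPU; intersection algorithm; parallel algorithm; time hiding; valid search range

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]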


References (10)

[1] Wu Enhua. Technologies, Current Status, and Challenges of General-Purpose Computation on Graphics Processing Units[J]. Journal of Software, 2004, 15(10): 1493-1504. (in Chinese) Cited by: 141
[2] Barbay J, López-Ortiz A, Salinger A, et al. An Experimental Investigation of Set Intersection Algorithms for Text Searching[J]. Journal of Experimental Algorithmics, 2009, 14(7): 7-24.
[3] Wu Di, Zhang Fan, Ao Naiyong, et al. Efficient Lists Intersection by CPU-GPU Cooperative Computing[C]//Proc. of International Symposium on Parallel & Distributed Processing. Atlanta, USA: IEEE Press, 2010.
[4] Ao Naiyong, Zhang Fan, Wu Di, et al. Efficient Parallel Lists Intersection and Index Compression Algorithms Using Graphics Processing Units[J]. Proceedings of the VLDB Endowment, 2011, 4(8): 470-481.
[5] Bentley J L, Yao A C C. An Almost Optimal Algorithm for Unbounded Searching[J]. Information Processing Letters, 1976, 5(3): 82-87.
[6] Krauthgamer R, Mehta A, Raman V, et al. Greedy List Intersection[C]//Proc. of the 24th International Conference on Data Engineering. Washington D. C., USA: IEEE Computer Society, 2008: 1033-1042.
[7] Wu Di, Zhang Fan, Ao Naiyong, et al. A Batched GPU Algorithm for Set Intersection[C]//Proc. of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks. [S. l.]: IEEE Press, 2009: 752-756.
[8] Chen Wei, Du Lingxia, Chen Hong. A Survey of Optimization Strategies for Data Processing Algorithms on Multi-core Architectures[J]. Journal of Frontiers of Computer Science and Technology, 2011, 5(12): 1057-1075. (in Chinese) Cited by: 7
[9] Yang Canqun, Wang Feng, Du Yunfei, et al. Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing[C]//Proc. of International Conference on Cluster Computing. [S. l.]: IEEE Press, 2010.
[10] Zou Yan, Yang Zhiyi, Zhang Kailong. Research on Memory Access Optimization Techniques for CUDA Parallel Programs[J]. Computer Measurement & Control, 2009, 17(12): 2504-2506. (in Chinese) Cited by: 17

