期刊文献+

基于Fermi架构的Join算法 被引量:1

Join Algorithms Based on Fermi Architecture
下载PDF
导出
摘要 在列数据库中,连接操作依然是最核心和最耗时的操作,GPU强大的计算能力可为此提供新的优化手段。基于Fermi架构,提出了新的Hash Join算法和Sort-merge Join算法,其基本思想是充分利用该架构新增的缓存结构来减少连接操作的cache缺失率。与CUDA stream技术相结合,新算法在输出结果较多时可以有效地隐藏主存与显存间数据传输带来的延迟,进一步提升其执行效率。实验结果证实了基于Fermi架构的Hash Join算法处理偏斜数据的高效性及Sort-merge Join算法的稳定性,并且通过比较表明,这两种算法的性能全面优于基于多核CPU充分优化的Join算法,最大加速2.4倍,在外键分布高偏斜时新的Hash Join算法的执行速度甚至达到每秒217M元组。 In column database, the join operation is still the most important and the most time-consuming operatton. GPUs' powerful computing capabilities can provide new optimizing solutions. This paper proposed a new Hash Join al- gorithm and a new Sort-merge Join which are both based on Fermi architecture. These two new implementation approa- ches take full advantage of the new parallel cache hierarchy of Fermi, and successfully reduce the cache miss rate of the join operation. Combined with CUDA stream technology, the new ioin algorithms can effectively hide the data transfer delay between the main memory and glohal memory, when the join operation generates a large number of results. The experimental results show that Hash Join based on Fermi performs better when it deals with data skew, while Sort- merge Join based on Fermi is more stable. Comparing with the implementations based on multi-core CPUs, these two al- gorithms are faster, speed up 2. 4 times at most, and the new Hash Join even achieves 217M tuples per second when fo- reign key's dataset is high skew.
出处 《计算机科学》 CSCD 北大核心 2013年第3期62-67,共6页 Computer Science
基金 广东省科技计划项目(2011A010801008 2011A090200122 2011A090200027)资助
关键词 JOIN算法 Fermi架构 缓存 CUDA STREAM Join algorithm, Fermi architecture, Cache, CUDA stream
  • 相关文献

参考文献12

  • 1He B,Lu M,Yang K,et al.Relational Query Coprocessing on Graphics Processors[J].ACM Transactions on Database Systems,2009,34 (4):23-32.
  • 2Kim C,Kaldewey T,Lee V W,et al.Sort vs.Hash Revisited:Fast Join Implementation on Modem Multi-core CPUs[C] //Proceedings of the VLDB Endowment.US:VLDB Endowment,2009:1378-1389.
  • 3Blanas S,Li Y,Patel J M.Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs[C] //Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data.NY USA:ACM,2011:37-48.
  • 4陈虎,欧彦麟,陈海波.面向多核处理器平台的Hash Join算法设计与实现[J].计算机研究与发展,2010,47(z1):171-175.
  • 5韩希先,杨东华,李建中.DBCC-Join:一种新的高速缓存敏感的磁盘连接算法[J].计算机学报,2010,33(8):1500-1511. 被引量:4
  • 6Satish N,Kim C,Chhugani J,et al.Fast Sort on CPUs and GPUs:A Case for Bandwidth Oblivious SIMD Sort[C] // Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data.NY USA:ACM,2010:351-362.
  • 7Satish N,Harris M,Garland M.Designing Efficient Sorting Algorithms for Manycore GPUs[C] // Proceedings of the 2009IEEE International Parallel and Distributed Processing Symposium.US:IEEE Computer Society,2009:1-10.
  • 8Chhugani J,Nguyen A D,Lee,V W,et al.Efficient Implementation of Sorting on Multi-core SIMD CPU Architecture[C] //Proceedings of the VLDB Endowment.US:VLDB Endowment,2008:1313-1324.
  • 9Senggupta P,Harris P,Zhang P,et al.Scan Primitives for GPU Computing[C] //Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS Symposium on Graphics hardware.Switzerland:Eurographics Association,2007:97-106.
  • 10Gray J,Sundaresan P,Englert S,et al.Quickly Generating Billion-record Synthetic Databases[C] //Proceedings of the 1994ACM SIGMOD International Conference on Management of Data.NY USA:ACM Press,1994:243-252.

二级参考文献20

  • 1Todd Jobson's Blog Reflections.Santa Clara,CA,USA:Sun Microsystems,2007.
  • 2Shatdal Ambuj,Kant Chander,Naughton Jeffrey F.Cache conscious algorithms for relational query processing//Proceedings of the 20th International Conference on Very Large Data Bases(VLDB'94).Santiago de Chile,Chile,Morgan Kaufmann,1994:510-521.
  • 3Mishra Priti,Eich Margaret H.Join processing in relational databases.ACM Computing Surveys,1992,24(1):63-113.
  • 4Boncz Peter A,Manegold Stefan,Kersten Martin L.Database architecture optimized for the new bottleneck:Memory access//Proceedings of the 25th International Conference on Very Large Data Bases(VLDB'99).Edinburgh,Scotland,UK,Morgan Kaufmann,1999:54-65.
  • 5Ailamaki Anastassia,DeWitt David J,Hill Mark D,Wood David A.DBMSs on a modern processor:Where does time go?//Proceedings of the 25th International Conference on Very Large Data Bases(VLDB'99).Edinburgh,Scotland,UK,Morgan Kaufmann,1999:266-277.
  • 6Manegold Stefan,Boncz Peter A,Kersten Martin L.What happens during a join? Dissecting CPU and memory optimization effect//Proceedings of the 26th International Conference on Very Large Data Bases(VLDB'00).Cairo,Egypt,Morgan Kaufmann,2000:339-350.
  • 7Stonebraker Mike,Abadi Daniel J,Batkin Adam et al.C-Store:A colmn-oriented DBMS//Proceedings of the 31st International Conference on Very Large Data Bases (VLDB'05).Trondheim,Norway,ACM,2005:553-564.
  • 8Han Xixian,Yang Donghua,Li Jianzhong.DBCC-Join:A novel cache-conscious disk-based join algorithm.Harbin Institute of Technology,Harbin:Technical Report DBTR-1002,2010.
  • 9He Bingsheng,Luo Qiong.Cache-oblivious nested-loop joins//Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management(CIKM'06).Arlington,Virginia,USA,ACM,2006:718-727.
  • 10He Bingsheng,Luo Qiong.Cache-oblivious query processing//Proceedings of th 3rd Biennial Conference on Innovative Data Systems Research(CIDR'07).Asilomar,CA,USA,2007:44-55.

共引文献3

同被引文献20

  • 1Myers D C.On the Use of NAND Flash Memory in High-Performance Relational Databases[D].Massachusetts Institute of Technology,2008.
  • 2Wu X,Qiu S,Narasimha Reddy A L.SCMFS:A File System for Storage Class Memory and its Extensions[J].ACM Transactions on Storage (TOS),2013,9(3):1-11.
  • 3Condit J,Nightingale E B,Frost C,et al.Better I/O through byte-addressable,persistent memory[C]∥Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles.ACM,2009:133-146.
  • 4Dulloor S R,Kumar S,Keshavamurthy A,et al.System software for persistent memory[C]∥Proceedings of the Ninth European Conference on Computer Systems.ACM,2014:1-15.
  • 5Sha E H M,Chen Xian-zhang,Zhuge Qing-feng,et al.Designing an efficient persistent in-memory file system.http://cacs.cqu.edu.cn/wp-content/uploads/2015/02/TR-2014-02-Designing-an-efficient-persistent-in-memory-file-system.pdf.
  • 6Ext4 wiki.https://ext4.wiki.kernel.org.
  • 7Raoux S,Burr G W,Breitwisch M J,et al.Phase-change random access memory:A scalable technology[J].IBM Journal of Research and Development,2008,52(4/5):465-479.
  • 8Jung M,Shalf J,Kandemir M.Design of a large-scale storage-class RRAM system[C]∥Eugene,Oregon,USA:Proceedings of the 27th International ACM Conference on International Conference on Super Computing.ACM,2013:103-114.
  • 9Chua L.Resistance switching memories are memristors[J].Applied Physics A,2011,102(4):765-783.
  • 10Kerman J.Toward a Universal Memory[J].Science,2005,308(5721):508-510.

引证文献1

二级引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部