
Benchmarking Data Management Systems: From Traditional Database to Emergent Big Data

Cited by: 68
Abstract: The arrival of the big data era has brought new techniques, systems, and products. How to compare and evaluate different systems objectively has naturally become a hot research topic, much as it was more than thirty years ago, when database systems were flourishing. Benchmarking research has played an important role throughout the successful development of database systems and has greatly advanced database technology and systems. A data management system benchmark is a set of specifications for evaluating and comparing the performance of different database systems, so that the performance gaps between systems with similar functionality are reflected objectively and comprehensively, which in turn promotes technological progress and guides the healthy development of the industry. Benchmarks are closely tied to applications: application development creates new data management needs, which trigger innovations in data management technology, which give rise to new data management systems and platforms, which finally call for new benchmarks. Benchmarks are diverse: they cover relational data as well as non-relational data such as semi-structured data, object data, streaming data, and spatial data. With the current development of new data systems, a wave of research on benchmarks for big data management systems has arrived as expected, and this research is likewise closely tied to applications. Although existing data management system benchmarks do not fully reflect the characteristics of big data, from a methodological point of view the experience gained over more than thirty years of benchmark development is the most valuable reference for developing big data systems and their benchmarks. This is the main motivation of this paper, which systematically reviews the history of data management system benchmarks, analyzes their achievements, and outlines future directions.
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 1, pp. 18-34 (17 pages)
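Funding: Supported by the National Key Basic Research Program of China ("973" Program, 2012CB316203), the National Natural Science Foundation of China (61432006, 61370101, 61321064), and the Key Scientific Research Innovation Project of the Shanghai Municipal Education Commission (14ZZ045).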
Keywords: benchmark; big data; data generator; metric; workload

