An Approach to Build a Big Data Benchmark Suite
(Original Chinese title: 大数据基准测试程序包构建方法研究)
Abstract (translated from the Chinese): Benchmarks are key tools for evaluating computer systems. However, the arrival of the big data era makes developing benchmarks for big data systems considerably harder, and neither academia nor industry yet has a widely accepted big data benchmark suite. This paper uses a real transportation big data system to build SIAT-Bench, a Hadoop-based benchmark suite for transportation big data. Program behavior is quantified through attributes selected at multiple levels, and a clustering algorithm is used to analyze the similarity of different program-input data set pairs. Based on the clustering results, representative programs and input data sets are selected for SIAT-Bench. Experimental results show that SIAT-Bench preserves the diversity of program behavior while eliminating redundancy in the benchmark set.

Abstract (English): Benchmarks are important tools for evaluating the performance of a variety of computing systems. However, benchmarks for big data systems are lacking: big data is relatively new, and researchers who want to understand how big data systems work, in both hardware and software, do not have the data to do so. In this paper, we first devise an approach to developing big data benchmarks. We then present a big data benchmark suite named SIAT-Bench, which contains five representative workloads from the Shenzhen urban transportation system. To this end, we characterize program behavior and quantify the impact of input data sets by observing metrics at multiple levels: microarchitecture, OS, and application layer. Statistical techniques such as Principal Component Analysis (PCA) and clustering are then employed to perform similarity analysis between different workload-input pairs. Finally, we build SIAT-Bench by selecting representative workloads and their associated input sets according to the clustering results. Experimental results show that SIAT-Bench properly satisfies the requirements of a benchmark suite.
Source: Journal of Integration Technology (《集成技术》), 2014, Issue 4, pp. 1-9 (9 pages)
Keywords: big data benchmark; input data sets; workload-input pairs; program similarity; urban traffic systems; GPS trajectory data
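
The similarity analysis outlined in the abstract (describe each workload-input pair by a metric vector, reduce it with PCA, cluster the pairs, then keep one representative per cluster) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The pair labels, the metric columns and their values, the 90% variance threshold, the use of average-linkage clustering, and the nearest-to-centroid selection rule are all assumptions made for the example.

```python
import numpy as np

# Hypothetical workload-input pairs and metric values, for illustration only.
# Assumed columns: IPC, cache misses per 1k instructions, CPU utilization, disk I/O (MB/s).
names = ["A/in1", "A/in2", "A/in3", "B/in1", "B/in2", "C/in1", "C/in2"]
X = np.array([
    [1.00, 10.0, 0.90, 5.0],
    [1.10, 11.0, 0.88, 6.0],
    [1.05, 10.5, 0.89, 5.5],
    [0.40, 40.0, 0.50, 50.0],
    [0.45, 42.0, 0.52, 48.0],
    [2.00, 2.0, 0.30, 1.0],
    [1.90, 3.0, 0.32, 2.0],
])

def standardize(X):
    """Scale every metric to zero mean and unit variance."""
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant columns unscaled
    return (X - X.mean(axis=0)) / sigma

def pca(X, var_kept=0.9):
    """Project centered data onto the fewest components explaining var_kept of the variance."""
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    ratio = np.cumsum(S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(ratio, var_kept) + 1)
    return X @ Vt[:k].T

def average_linkage(Z, n_clusters):
    """Merge the two closest clusters (mean pairwise distance) until n_clusters remain."""
    D = np.linalg.norm(Z[:, None] - Z[None, :], axis=2)
    clusters = [[i] for i in range(len(Z))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters

def representatives(Z, clusters):
    """For each cluster, pick the member closest to the cluster centroid."""
    reps = []
    for c in clusters:
        dist = np.linalg.norm(Z[c] - Z[c].mean(axis=0), axis=1)
        reps.append(c[int(np.argmin(dist))])
    return reps

Z = pca(standardize(X))
clusters = average_linkage(Z, n_clusters=3)
reps = representatives(Z, clusters)
for c, r in zip(clusters, reps):
    print(sorted(names[i] for i in c), "->", names[r])
```

Pairs that land in the same cluster behave similarly under the chosen metrics, so keeping one representative per cluster is one way to retain behavioral diversity while dropping redundant pairs. The number of clusters is a tuning choice that the abstract does not specify.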