
A Research on Data Management for Science Workflow in Cloud (云计算环境下科学工作流中的数据管理研究)
Abstract: In the era of big data, scientists must use several scientific workflow management systems (SWfMSs) together to complete a single large experiment. Data from different environments and different SWfMSs make up big scientific data, which poses challenges for data management in these systems. A scientific workflow usually consists of several tasks that operate on input data and generate new data for subsequent tasks; these data must be stored temporarily or long term and be retrievable when needed. The paper presents a model for executing scientific workflows in a cloud environment and, taking advantage of object-based storage, provides two data management schemes that lay out and optimize the storage of input, intermediate, and output data. It closes with recommendations for data management for scientific workflows in cloud computing environments.
Source: Research on Library Science (《图书馆学研究》), CSSCI, 2015, No. 1, pp. 65-70 (6 pages)
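Funding: One of the research outputs of two projects funded by the National Natural Science Foundation of China: "Research on Information Resource Clouds for the Fourth Paradigm of Scientific Research in the Big Data Environment" (Grant No. 71373191) and "Research on Service Level Agreements for Library Information Services in the Cloud Computing Environment" (Grant No. 71173163).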
Keywords: cloud computing; object-based storage; SWfMS (Scientific Workflow Management System)
