
Data Selection Strategy for Data-intensive Applications in Cloud (cited by: 1)
Abstract: Bag-of-Tasks data-intensive applications in the cloud have appeared in many fields. Given multiple decentralized data centers and the pay-on-demand model of resource usage, these applications face new challenges in data selection, chiefly how to choose appropriate data resources as application input from multiple datasets that have the same content but different locations and access costs. To address this problem, the cloud environment and the data selection problem are first modeled. Based on this model, the cost-minimizing data selection process is abstracted as a weighted set covering problem, and a new data selection strategy is proposed to balance execution efficiency against economic cost. Experimental results show that the proposed strategy accounts for execution efficiency while keeping cost optimized, and achieves good overall performance.
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2012, No. 6, pp. 30-34, 71 (6 pages in total)
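Funding: Supported by the National Natural Science Foundation of China (60703048), the Open Fund of the State Key Laboratory for Novel Software Technology at Nanjing University (KFKT2009B22), and the Open Fund of the State Key Laboratory of Software Engineering at Wuhan University (SKLSE20080720)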
Keywords: Cloud computing; Bag-of-Tasks; Data-intensive application; Data selection; Weighted set covering
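
The weighted set covering formulation named in the abstract can be illustrated with a small sketch. This is not the authors' strategy, which also weighs execution efficiency; it is the standard greedy heuristic for weighted set cover, applied to cost only. The names `required`, `sources`, and the data center labels `dc_a` to `dc_d` are hypothetical, as are the costs.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def greedy_weighted_set_cover(
    required: Set[str],
    sources: Dict[str, Tuple[FrozenSet[str], float]],
) -> Tuple[List[str], float]:
    """Pick data sources that together hold every required dataset.

    Greedy heuristic: repeatedly take the source with the lowest
    access cost per newly covered dataset. This approximates the
    minimum-cost cover; it is not an exact solver.
    """
    uncovered = set(required)
    chosen: List[str] = []
    total_cost = 0.0
    while uncovered:
        best_name = None
        best_ratio = float("inf")
        # Sorted iteration makes tie-breaking deterministic.
        for name, (datasets, cost) in sorted(sources.items()):
            gain = len(datasets & uncovered)
            if gain == 0:
                continue  # source adds nothing new
            ratio = cost / gain
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError(f"no source holds: {sorted(uncovered)}")
        datasets, cost = sources[best_name]
        chosen.append(best_name)
        total_cost += cost
        uncovered -= datasets
    return chosen, total_cost


# Hypothetical input: four datasets replicated across four data centers,
# each data center with its own access cost.
required = {"d1", "d2", "d3", "d4"}
sources = {
    "dc_a": (frozenset({"d1", "d2"}), 4.0),
    "dc_b": (frozenset({"d2", "d3", "d4"}), 6.0),
    "dc_c": (frozenset({"d1"}), 1.0),
    "dc_d": (frozenset({"d3", "d4"}), 5.0),
}
chosen, total_cost = greedy_weighted_set_cover(required, sources)
print(chosen, total_cost)  # ['dc_c', 'dc_b'] 7.0
```

In this toy input the greedy choice happens to match the cheapest cover, but in general the heuristic only guarantees a logarithmic approximation factor.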


