Abstract
Data-intensive Bag-of-Tasks applications in the cloud have appeared in many fields. Given decentralized data centers and the "pay-on-demand" model of resource usage, these applications face new challenges in data selection, chiefly how to choose appropriate input data from multiple datasets that have the same content but differ in location and access cost. To address this problem, the cloud environment and the data selection problem are first modeled. Based on this model, the cost-minimizing data selection process is abstracted as a weighted set covering problem, and a new data selection strategy is proposed to balance execution efficiency against economic cost. Experimental results show that the proposed strategy accounts for execution efficiency while optimizing cost, and achieves good overall performance.
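To make the weighted set covering formulation concrete, the sketch below treats each replica location as a set of data blocks with an access cost and picks locations until every required block is covered. It uses the standard greedy heuristic (lowest cost per newly covered block), which is a stand-in for illustration and not necessarily the strategy proposed in the paper. The function name, the data center names, the block ids, and the costs are all hypothetical.

```python
def greedy_weighted_set_cover(required, replicas):
    """Pick replica locations covering all required blocks at low total cost.

    required: set of data block ids the application needs (hypothetical ids).
    replicas: dict mapping a location name to (access_cost, set_of_blocks).
    Returns (list of chosen location names, total access cost).
    Standard greedy approximation; assumed here, not taken from the paper.
    """
    uncovered = set(required)
    chosen = []
    total_cost = 0.0
    while uncovered:
        best_name, best_ratio = None, float("inf")
        for name, (cost, blocks) in replicas.items():
            gain = len(blocks & uncovered)  # blocks this location would newly cover
            if gain == 0:
                continue
            ratio = cost / gain  # cost per newly covered block
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("required data cannot be covered by the given replicas")
        cost, blocks = replicas[best_name]
        chosen.append(best_name)
        total_cost += cost
        uncovered -= blocks
    return chosen, total_cost


# Hypothetical example: four blocks replicated across four data centers.
required = {"d1", "d2", "d3", "d4"}
replicas = {
    "dc_a": (4.0, {"d1", "d2", "d3"}),
    "dc_b": (1.0, {"d1", "d2"}),
    "dc_c": (1.5, {"d3", "d4"}),
    "dc_d": (3.0, {"d4"}),
}
chosen, total_cost = greedy_weighted_set_cover(required, replicas)
print(chosen, total_cost)
```

In this example the greedy rule first picks dc_b (cost 0.5 per block), then dc_c (cost 0.75 per remaining block), for a total cost of 2.5. The sketch only minimizes access cost; weighing execution efficiency against cost, as the abstract describes, would need an additional term in the selection ratio.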
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2012, Issue 6, pp. 30-34, 71 (6 pages in total)
Computer Science
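Funding
Supported by the National Natural Science Foundation of China (60703048)
Open Fund of the State Key Laboratory for Novel Software Technology, Nanjing University (KFKT2009B22)
Open Fund of the State Key Laboratory of Software Engineering, Wuhan University (SKLSE20080720)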