Load Feedback-Based Resource Scheduling and Dynamic Migration-Based Data Locality for Virtual Hadoop Clusters in OpenStack-Based Clouds (Cited by: 4)

Abstract: As cloud computing technology matures, it is essential to combine the big data processing tool Hadoop with Infrastructure as a Service (IaaS) cloud platforms. In this study, we first propose a new Dynamic Hadoop Cluster on IaaS (DHCI) architecture, which includes four key modules: monitoring, scheduling, Virtual Machine (VM) management, and VM migration. The monitoring module collects the load of both physical hosts and VMs, which can be used to design resource scheduling and data locality solutions. Second, we present a simple load feedback-based resource scheduling scheme. It can avoid allocating resources on overburdened physical hosts, or achieve strong scalability of the virtual cluster by varying the number of VMs. To improve flexibility, we deploy the computation and storage VMs separately in the DHCI architecture, which negatively impacts data locality. Third, we reuse the VM migration method and propose a dynamic migration-based data locality scheme using parallel computing entropy. We migrate the computation nodes to other host(s) or rack(s) where the corresponding storage nodes are deployed, so as to satisfy the data locality requirement. We evaluate our solutions in a realistic scenario based on OpenStack. Substantial experimental results demonstrate the effectiveness of our solutions, which contribute to workload balancing and performance improvement, even under heavily loaded cloud system conditions.
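The load feedback idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no formulas or thresholds, so the `Host` type, the functions `pick_host` and `target_vm_count`, and every numeric threshold below are hypothetical. The migration-based data locality scheme and its parallel computing entropy measure are not modeled, because the abstract does not define them.

```python
from dataclasses import dataclass

# Hypothetical threshold; the abstract does not state an actual value.
OVERLOAD_THRESHOLD = 0.8


@dataclass
class Host:
    """A physical host as seen by a (hypothetical) monitoring module."""
    name: str
    load: float  # normalized load in [0, 1]


def pick_host(hosts, threshold=OVERLOAD_THRESHOLD):
    """Return the least-loaded host below the threshold.

    Returns None when every host is overburdened, so that no new
    resources are allocated on an overloaded machine.
    """
    candidates = [h for h in hosts if h.load < threshold]
    if not candidates:
        return None
    return min(candidates, key=lambda h: h.load)


def target_vm_count(current, cluster_load, low=0.3, high=0.7,
                    min_vms=1, max_vms=16):
    """Adjust the number of VMs in the virtual cluster by load feedback.

    Adds one VM when the cluster load is above `high`, removes one when
    it is below `low`, and otherwise keeps the current size. All bounds
    are illustrative assumptions.
    """
    if cluster_load > high:
        return min(current + 1, max_vms)
    if cluster_load < low:
        return max(current - 1, min_vms)
    return current
```

In this sketch, a scheduler would call `pick_host` with the latest monitoring data before placing each new VM, and call `target_vm_count` periodically to decide whether to grow or shrink the virtual cluster.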
Source: Tsinghua Science and Technology (清华大学学报(自然科学版)(英文版)), indexed in SCIE, EI, CAS, CSCD; 2017, Issue 2, pp. 149-159 (11 pages)
Funding: Supported by the Open Project Program of Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks (No. WSNLBKF201503), the Fundamental Research Funds for the Central Universities (Nos. 2016JBM011 and 2014ZD03-03), the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology.
Keywords: Hadoop; resource scheduling; data locality; Infrastructure as a Service (IaaS); OpenStack
