A Heuristic Cloud Bursting Algorithm for Big-Data Analytics Services (cited by: 1)
Abstract Parallelising big-data analytics services over a federation of heterogeneous clouds can improve performance. However, because transmitting big data over the network incurs a significant delay, there is an inherent trade-off between the level of parallelisation and the performance of big-data analytics. In view of this, we propose a heuristic cloud bursting algorithm for parallelised big-data analytics services. First, the algorithm determines which computing nodes in the federated clouds should be used for parallel processing of the analytics. It then allocates the data to these nodes so that they complete processing at the same time with the best performance. Finally, it determines the order in which each node computes its allocated chunks of different sizes, so that, as far as possible, the transfer of each chunk completes while the node is still computing the previous chunk. Compared with other load-balancing schemes, the algorithm improves performance by 20% to 60%.
Source Computer Applications and Software (《计算机应用与软件》), CSCD, 2015, No. 2, pp. 249-254, 260 (7 pages)
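Funding Teaching Reform Project supported by the Department of Education of Hebei Province (103004); Project of the Higher Vocational Education Committee of the Ministry of Education (jzw590111050)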
Keywords Federated clouds; Big-data analytics; Parallel processing; Cloud bursting; Load balancing
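
The three steps named in the abstract (node selection, data allocation with simultaneous completion, and chunk ordering that overlaps transfer with computation) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. The cost model is an assumption: each node has a hypothetical `bandwidth`, `compute` rate, and fixed start-up `latency`. Node selection and allocation use a simple equal-finish-time calculation, and chunk ordering swaps in Johnson's rule for two-stage pipelines as a stand-in heuristic.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    bandwidth: float  # MB/s from the data source to this node (assumed parameter)
    compute: float    # MB/s this node can analyse (assumed parameter)
    latency: float    # fixed start-up delay in seconds (assumed parameter)

    @property
    def rate(self) -> float:
        # When transfer overlaps computation, the slower stage bounds throughput.
        return min(self.bandwidth, self.compute)


def select_and_allocate(nodes, total_mb):
    """Steps 1 and 2: pick nodes and split the data so all picked nodes finish together."""
    chosen, finish = [], float("inf")
    for node in sorted(nodes, key=lambda n: n.latency):
        if node.latency >= finish:
            break  # this node could not start before the others are already done
        chosen.append(node)
        # Solve sum(rate_i * (T - latency_i)) == total_mb for the common finish time T.
        weighted = sum(n.rate * n.latency for n in chosen)
        finish = (total_mb + weighted) / sum(n.rate for n in chosen)
    shares = {n.name: n.rate * (finish - n.latency) for n in chosen}
    return shares, finish


def order_chunks(chunk_sizes, node):
    """Step 3 stand-in: Johnson's rule for a two-stage (transfer, compute) pipeline."""
    jobs = [(s / node.bandwidth, s / node.compute, s) for s in chunk_sizes]
    # Chunks that transfer faster than they compute go first, shortest transfer first.
    first = sorted((j for j in jobs if j[0] <= j[1]), key=lambda j: j[0])
    # The remaining chunks go last, longest computation first.
    last = sorted((j for j in jobs if j[0] > j[1]), key=lambda j: j[1], reverse=True)
    return [j[2] for j in first + last]


def makespan(chunk_sizes, node):
    """Simulate one node: a chunk is computed once it has arrived and the node is free."""
    arrived = done = 0.0
    for s in chunk_sizes:
        arrived += s / node.bandwidth
        done = max(arrived, done) + s / node.compute
    return done


if __name__ == "__main__":
    cluster = [
        Node("A", bandwidth=100, compute=50, latency=0),
        Node("B", bandwidth=20, compute=40, latency=10),
        Node("C", bandwidth=10, compute=10, latency=1000),
    ]
    shares, finish = select_and_allocate(cluster, total_mb=1000)
    print(shares, finish)
    print(order_chunks([300, 100, 200], cluster[0]))
```

In this sketch a node is left out when its start-up delay exceeds the time the already-selected nodes need to finish, which mirrors the trade-off between parallelisation level and transfer delay that the abstract describes.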

