Data Placement Algorithm Based on Data-Dependence Clustering (cited by: 2)

Data placement algorithm based on data dependence
Abstract: A prominent feature of modern information systems is the distributed application cluster built on massive data. Optimizing the storage layout of massive data, so as to improve both storage resource utilization and application execution speed, is an important research topic. Because data items depend on one another, a placement algorithm that considers only load balancing is of limited practical use; the dependencies between data items must also be taken into account to speed up application execution. This paper builds a dependency matrix between data items, clusters the data based on that matrix, and then assigns the data to the data centers. It computes the amount of data movement incurred when applications execute and compares it with consistent hashing; the results show that the data movement is far lower than with consistent hashing.
Authors: 董微 (Dong Wei), 闻育 (Wen Yu)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2014, No. 3, pp. 117-120 (4 pages)
Keywords: data placement; clustering; consistent hash; data dependence
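
The pipeline outlined in the abstract (dependency matrix, clustering, assignment to data centers, data-movement comparison against consistent hashing) can be illustrated with a minimal sketch. The abstract does not specify the clustering method, the capacity model, or the exact movement metric, so everything below is an assumption made for illustration: the greedy clustering rule, the function names (`dependency_matrix`, `cluster_placement`, `consistent_hash_placement`, `data_movement`), the per-center `capacity` limit, and the choice to count movement as the number of items a task must fetch from outside the center that already holds most of its inputs.

```python
import bisect
import hashlib
from collections import Counter
from itertools import combinations


def dependency_matrix(tasks, n_items):
    """dep[i][j] = number of tasks that use both item i and item j."""
    dep = [[0] * n_items for _ in range(n_items)]
    for items in tasks:
        for i, j in combinations(sorted(set(items)), 2):
            dep[i][j] += 1
            dep[j][i] += 1
    return dep


def cluster_placement(dep, n_centers, capacity):
    """Greedy stand-in for clustering (assumed, not the paper's method):
    put each item in the non-full center whose members it depends on most."""
    centers = [[] for _ in range(n_centers)]
    placement = {}
    # Visit the most strongly connected items first.
    order = sorted(range(len(dep)), key=lambda i: -sum(dep[i]))
    for item in order:
        open_centers = [c for c in range(n_centers) if len(centers[c]) < capacity]
        best = max(
            open_centers,
            # Prefer higher affinity; break ties toward the emptier center.
            key=lambda c: (sum(dep[item][m] for m in centers[c]), -len(centers[c])),
        )
        centers[best].append(item)
        placement[item] = best
    return placement


def _hash(text):
    return int(hashlib.md5(text.encode()).hexdigest(), 16)


def consistent_hash_placement(n_items, n_centers, vnodes=50):
    """Baseline: a small consistent-hash ring with virtual nodes."""
    ring = sorted(
        (_hash(f"center-{c}#{v}"), c) for c in range(n_centers) for v in range(vnodes)
    )
    points = [p for p, _ in ring]
    placement = {}
    for item in range(n_items):
        # First ring point clockwise from the item's hash, wrapping around.
        idx = bisect.bisect_right(points, _hash(f"item-{item}")) % len(ring)
        placement[item] = ring[idx][1]
    return placement


def data_movement(tasks, placement):
    """Assumed metric: each task runs at the center holding most of its
    items; every other item it needs counts as one move."""
    total = 0
    for items in tasks:
        counts = Counter(placement[i] for i in set(items))
        total += len(set(items)) - max(counts.values())
    return total


# Hypothetical workload: each task lists the data items it reads.
tasks = [[0, 1, 2], [0, 1, 2], [3, 4, 5], [3, 4, 5], [6, 7], [6, 7]]
dep = dependency_matrix(tasks, n_items=8)
clustered = cluster_placement(dep, n_centers=3, capacity=3)
hashed = consistent_hash_placement(n_items=8, n_centers=3)

print("movement with clustering:", data_movement(tasks, clustered))
print("movement with consistent hashing:", data_movement(tasks, hashed))
```

On this toy workload the items used together end up in the same center under the clustered placement, so no task needs remote data, whereas the hash-based placement ignores co-access entirely. This only illustrates the idea and does not reproduce the paper's experiments.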