Storage Optimization of Product Big Data Based on Hadoop Platform
Original title: 基于Hadoop的产品大数据分布式存储优化 (Distributed Storage Optimization of Product Big Data Based on Hadoop). Cited by: 1
Abstract: This paper studies how to organize, store, and retrieve product-related big data resources, and proposes storing them in blocks on the Hadoop platform. Building on the MapReduce parallel framework, it introduces a multi-replica consistent hashing storage algorithm that takes data correlation and spatio-temporal attributes into account, and it tunes Hadoop's data partitioning strategy and block size. On top of this optimized storage layout, a multi-source parallel map join query method and a multichannel data fusion feature extraction technique are used to retrieve product big data, which improves the efficiency of data resource management. Experiments show that multi-source parallel join retrieval takes 31.9% of the execution time of the standard Hadoop scheme.
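The abstract names a multi-replica consistent hashing algorithm but does not give its details, so the following is only a minimal generic sketch of the underlying technique, not the authors' method. The class `ConsistentHashRing`, the helper `placement_key`, the node names, and the choice to build the key from product, region, and day (to illustrate how correlated, spatio-temporally close records could land on the same nodes) are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the hash ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Generic consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node appears `vnodes` times on the ring to even out load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, key, copies=3):
        """Return up to `copies` distinct nodes, walking clockwise from the key."""
        copies = min(copies, self._node_count)
        if copies <= 0:
            return []
        start = bisect.bisect_right(self._positions, _hash(key))
        chosen = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == copies:
                    break
        return chosen


def placement_key(product_id, region, day):
    # Hypothetical key: records sharing product, region, and day hash to the
    # same ring position, so correlated records share the same replica set.
    return f"{product_id}|{region}|{day}"


if __name__ == "__main__":
    ring = ConsistentHashRing(["dn1", "dn2", "dn3", "dn4"])
    print(ring.replicas(placement_key("P100", "east", "2021-05-01")))
```

Because placement depends only on the key and the ring, adding or removing a node moves only the keys adjacent to its virtual nodes, which is the usual motivation for consistent hashing over modulo-based partitioning.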
Source: Computer Science and Application (《计算机科学与应用》), 2021, No. 5, pp. 1503-1511 (9 pages)