Abstract
In the era of big data, rapid data growth poses new challenges for data warehouse management methods and technologies. To optimize the allocation of storage resources and improve their utilization, this paper applies the data half-life to tiered storage in data warehouses for the first time. The traditional fixed-threshold migration strategy allocates storage resources poorly. The half-life tiered storage strategy instead computes and analyzes each data object individually before migrating it. A hybrid data warehouse storage architecture is built from an MPP data warehouse and Hadoop, which addresses data storage and analysis in the big data setting and represents an innovation in both data warehouse management methods and data storage architecture. Empirical validation shows that the half-life migration strategy outperforms the fixed-threshold strategy, demonstrating that the data half-life has significant application value in data warehouse management.
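The contrast between the two migration strategies can be illustrated with a minimal sketch. The abstract does not give the paper's half-life formula, so everything below is an assumption: the half-life is taken, by analogy with the bibliometric definition, as the object age by which half of its recorded accesses had occurred, and an object is moved from the MPP tier to the Hadoop tier once its age exceeds a multiple of that half-life. The names `DataObject`, `half_life_days`, `tier_fixed`, `tier_half_life`, and the parameters `threshold_days` and `multiple` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataObject:
    name: str
    access_ages_days: List[float]  # object age (days since creation) at each recorded access
    age_days: float                # current age of the object in days


def half_life_days(access_ages: List[float]) -> float:
    """Assumed definition: smallest age by which at least half of all accesses had occurred."""
    if not access_ages:
        return 0.0  # never accessed: treated as having no hot period
    ordered = sorted(access_ages)
    half = (len(ordered) + 1) // 2  # number of accesses that make up at least half
    return ordered[half - 1]


def tier_fixed(obj: DataObject, threshold_days: float = 90.0) -> str:
    """Fixed-threshold strategy: one age cutoff shared by every object."""
    return "hadoop" if obj.age_days > threshold_days else "mpp"


def tier_half_life(obj: DataObject, multiple: float = 2.0) -> str:
    """Half-life strategy (assumed rule): the cutoff is computed per object."""
    cutoff = multiple * half_life_days(obj.access_ages_days)
    return "hadoop" if obj.age_days > cutoff else "mpp"


if __name__ == "__main__":
    # Hypothetical objects: one accessed only in its first days, one still in use.
    burst = DataObject("daily_log", [1, 2, 3, 5], age_days=30)
    steady = DataObject("asset_master", [10, 60, 100, 115], age_days=120)
    for obj in (burst, steady):
        print(obj.name, tier_fixed(obj), tier_half_life(obj))
```

In this sketch the fixed threshold keeps the short-lived object on the MPP tier and evicts the one that is still in use, while the per-object cutoff does the opposite. This is the kind of misallocation the abstract attributes to fixed thresholds.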
Authors
曾广移
卢勇
李德华
李俊超
ZENG Guang-wei; LU Yong; LI De-hua; LI Jun-chao (Southern Power Grid Peak Modulation FM Power Generation Co., Ltd; Southern Power Grid Science Research Institute, Guangzhou 510623, China)
Source
《软件导刊》
2019, No. 2, pp. 123-127, 131 (6 pages in total)
Software Guide