Abstract
ETL tools are basic building blocks for constructing and maintaining a data warehouse. Because they process very large volumes of data, shortening the response time effectively is a problem worth studying. This paper proposes the "derived by one view" (主表衍生) model of the ETL process and, for this model, applies a pipelining algorithm based on horizontal partitioning of the view to increase parallelism and thereby shorten the ETL execution time. Theoretical analysis and experiments show that the approach is effective.
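The pipelining idea named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not specify. It only assumes that the master view is split horizontally into row chunks and that each chunk flows through a chain of stages, so that a later stage can work on one chunk while an earlier stage handles the next. All names here (`partition`, `stage`, `run_pipeline`, `clean`, `derive`) and the sample columns are hypothetical.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the partition stream


def partition(rows, size):
    """Split rows horizontally into consecutive chunks of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def stage(func, inbox, outbox):
    """Apply `func` to each chunk taken from `inbox` and pass the result on."""
    while True:
        chunk = inbox.get()
        if chunk is SENTINEL:
            outbox.put(SENTINEL)  # propagate end-of-stream to the next stage
            return
        outbox.put(func(chunk))


def run_pipeline(rows, transforms, size=2):
    """Push horizontal partitions of `rows` through one thread per stage."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(transforms)
    ]
    for w in workers:
        w.start()
    for chunk in partition(rows, size):
        queues[0].put(chunk)
    queues[0].put(SENTINEL)

    loaded = []  # stands in for the load step into the warehouse
    while True:
        chunk = queues[-1].get()
        if chunk is SENTINEL:
            break
        loaded.extend(chunk)
    for w in workers:
        w.join()
    return loaded


# Hypothetical transformation stages, for illustration only.
def clean(chunk):
    return [{**r, "name": r["name"].strip()} for r in chunk]


def derive(chunk):
    return [{**r, "total": r["qty"] * r["price"]} for r in chunk]


if __name__ == "__main__":
    rows = [
        {"name": " a ", "qty": 2, "price": 3},
        {"name": "b", "qty": 1, "price": 5},
        {"name": " c", "qty": 4, "price": 2},
    ]
    print(run_pipeline(rows, [clean, derive], size=2))
```

Each stage runs in a single thread and the queues are first-in, first-out, so the chunks arrive at the load step in their original order. In CPython, threads mainly overlap I/O-bound work; CPU-bound stages would need processes to run truly in parallel.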
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2005, No. 6, pp. 1013-1017 (5 pages)
Journal of Chinese Computer Systems
Funding
Supported by the Jiangsu Province Tenth Five-Year Plan High-Tech Project (BG2001013).