
Research on Optimization Strategies for ETL Systems in a Distributed Environment (Cited by: 1)

Research on ETL Scheduling Model in Distributed System
Abstract: ETL (extract, transform, load) is the key process of extracting data from different data sources into a data warehouse. The design, maintenance, and modification of ETL processes directly affect both the efficiency of data processing and the quality of the data in the warehouse. By analyzing the model characteristics of ETL activities and drawing on ideas from distributed computing, this paper proposes a new ETL system model. It then presents an optimization scheme that is based on this system architecture and fits the topological characteristics of ETL tasks, and describes in detail how data and scheduling information flow through the system.
Source: Modern Computer (《现代计算机(中旬刊)》, mid-month edition), 2016, No. 8, pp. 39-42, 80 (5 pages)
Keywords: Data Warehouse; Distributed System; Extract-Transform-Load (ETL)
