
Overview of Data Extraction, Transformation and Loading (数据ETL研究综述)
Cited by: 106
Abstract Data extraction, transformation and loading (ETL) is a key stage of data warehousing and has a decisive influence on the data quality of the data warehouse. With the development of informatization, ETL has become one of the more active research fields, yet ETL theory and technology are still immature. Addressing the open problems in current ETL research and the factors that need to be considered, this survey starts from the main problems at each ETL stage, lists the research methods and results proposed for them, and analyzes them. Finally, it summarizes future research directions for ETL and offers suggestions for future work.
Authors Xu Jungang (徐俊刚), Pei Ying (裴莹)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 4, pp. 15-20 (6 pages)
Funding Supported by the National 863 Program project "Runtime Power Management Technology for Data Centers" (2009AA01Z139)
Keywords ETL; data warehouse; data quality; metadata

