Research on Construction Pattern of Hadoop Data Warehouse (cited by: 7)
Abstract: Hadoop-based data warehouses are currently built mostly with a "case-to-case" (one-to-one) pattern. This paper first analyzes the shortcomings of that pattern through an example. Then, drawing on the Builder design pattern from software engineering, it proposes a construction pattern called the "metadata-driven builder pattern" for building Hadoop (Hive) data warehouses, that is, their ETL processes. The pattern has two advantages. First, it is driven by metadata: because Hive stores its metadata in a relational database management system (RDBMS), metadata operations benefit from the efficiency of the RDBMS. Second, it distinguishes "general knowledge" from "specific-object knowledge" and designs and implements the ETL process on the basis of that distinction, which eliminates the many unnecessary repeated operations that the case-to-case pattern entails.
Source: Journal of Chongqing University of Technology (Natural Science), CAS, 2015, No. 7, pp. 69-73 (5 pages)
Funding: Natural Science Research Project of the Hubei Provincial Department of Education (Q20141212)
Keywords: cloud computing; big data; data warehouse; Hadoop; ETL
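
The abstract gives no implementation details, so the following is only a rough sketch of what a metadata-driven builder could look like, not the authors' method. It assumes a hypothetical metadata table `etl_columns` (held here in an in-memory SQLite database as a stand-in for the RDBMS), hypothetical `stg_` and `dw_` table prefixes, and two simple HiveQL templates. The templates play the role of "general knowledge"; the metadata rows play the role of "specific-object knowledge". The code only generates HiveQL strings and does not connect to Hive.

```python
import sqlite3


def load_metadata(conn):
    """Read per-table column metadata ("specific-object knowledge").

    The etl_columns schema is an assumption made for this sketch.
    """
    tables = {}
    rows = conn.execute(
        "SELECT table_name, column_name, hive_type "
        "FROM etl_columns ORDER BY table_name, position"
    )
    for table_name, column_name, hive_type in rows:
        tables.setdefault(table_name, []).append((column_name, hive_type))
    return tables


def build_statements(table_name, columns):
    """Apply the shared HiveQL templates ("general knowledge") to one table."""
    col_defs = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    col_list = ", ".join(f"`{name}`" for name, _ in columns)
    create = (
        f"CREATE TABLE IF NOT EXISTS dw_{table_name} ({col_defs}) STORED AS ORC"
    )
    load = (
        f"INSERT OVERWRITE TABLE dw_{table_name} "
        f"SELECT {col_list} FROM stg_{table_name}"
    )
    return [create, load]


def build_all(conn):
    """Generate the ETL statements for every table described in the metadata."""
    statements = []
    for table_name, columns in load_metadata(conn).items():
        statements.extend(build_statements(table_name, columns))
    return statements


def demo_connection():
    """Create an in-memory metadata store with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE etl_columns ("
        "table_name TEXT, position INTEGER, column_name TEXT, hive_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO etl_columns VALUES (?, ?, ?, ?)",
        [
            ("orders", 1, "order_id", "BIGINT"),
            ("orders", 2, "amount", "DOUBLE"),
            ("customers", 1, "customer_id", "BIGINT"),
        ],
    )
    return conn


if __name__ == "__main__":
    for statement in build_all(demo_connection()):
        print(statement)
```

Under these assumptions, adding a new source table means inserting metadata rows rather than hand-writing another pair of statements, which is the kind of repetition the abstract attributes to the case-to-case pattern.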