Abstract
Data warehouses in logistics enterprises suffer from poor scalability, a low degree of operational automation, and poor performance on large-scale data. To address these problems, this paper analyzes a Hive-based logistics data warehouse and proposes a concrete implementation scheme. The data warehouse combines cloud platform virtualization technology with Hadoop and Hive environments to build a big data processing platform based on virtualization. The design is studied from two perspectives, data ETL and data query and analysis, and covers the scalability of the data warehouse, Hive data storage analysis, and Hive data preprocessing. An analysis of the Hive data warehouse in operation indicates that the system can effectively support decision-making by enterprise management.
Source
《电子设计工程》
2017, No. 9, pp. 31-35 (5 pages)
Electronic Design Engineering
Funding
Henan Province Science and Technology Program Project (9412014J0069)