Abstract
OLAP (online analytical processing) is a core technology for business intelligence over relational data. In the big data era, there is strong demand for high-performance OLAP on large clusters of commodity machines, but achieving good system performance is a major challenge. Encouragingly, progress in recent years has been rapid: many so-called SQL on Hadoop systems, which perform OLAP over data stored on Hadoop, have emerged, and their performance keeps improving. This paper first surveys the development of OLAP technology, then tests and analyzes three representative SQL on Hadoop systems and presents the performance characteristics of this class of systems. Based on the results, such systems are expected to play an important role in the low-cost big data OLAP market.
Source
Big Data Research (《大数据》)
2015, No. 1, pp. 48-60 (13 pages)
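Funding
National Natural Science Foundation of China, General Program: "Research on Highly Scalable Data Encoding Methods and New Query Processing Techniques for Data Warehouses" (No. 61170013)
Research Funds of Renmin University of China (Fundamental Research Funds for the Central Universities) (No. 14XNLQ06)
National Social Science Foundation of China, Major Program: "Research on Information Resource Integration and Services in Cloud Computing Environments" (No. 12&ZD220)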