Abstract
Large data volume and high dimensionality are key characteristics of water census data. To meet the needs of water census decision analysis, and building on data cube research and a partial materialization strategy, this paper proposes the Hierarchical Dimension Encoding Fragment Cube (HDEFC). Exploiting the concept hierarchies of dimension attributes, hierarchical dimension fragments use a hybrid index (B-tree and Bit Code) to binary-encode the hierarchy attributes of each hierarchical dimension, and the resulting dimension codes replace the keys in the original table. Non-hierarchical dimension fragments use inverted indexes to materialize each fragment sub-cube. Together these reduce multi-table join operations and thereby improve OLAP query efficiency. Experimental results show that the generated HDEFC occupies relatively little storage space and that the query method has an advantage on complex high-dimensional queries. A water census data analysis system built on this approach shows that the method can effectively address the low computation and query efficiency caused by large data volume and many dimensions, and that it reduces the time and space cost of materializing the water census result data cube.
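The two ideas named in the abstract, bit-coding a concept hierarchy and indexing non-hierarchical dimensions with inverted lists, can be illustrated with a minimal sketch. This is not the paper's implementation: the hierarchy (basin > province > county), the sample rows, and every function and variable name below are hypothetical, and a plain dictionary stands in for the B-tree part of the hybrid index.

```python
from collections import Counter, defaultdict


def build_level_codes(paths):
    """Give each hierarchy path a bit code by concatenating fixed-width
    per-level sibling indexes (illustrative scheme, not the paper's)."""
    depth = len(paths[0])
    children = defaultdict(dict)  # parent prefix -> {member: local index}
    for p in paths:
        for i in range(depth):
            siblings = children[p[:i]]
            siblings.setdefault(p[i], len(siblings))
    widths = []
    for i in range(depth):
        fanout = max(len(s) for pre, s in children.items() if len(pre) == i)
        widths.append(max(1, (fanout - 1).bit_length()))
    codes = {}
    for p in paths:
        code = 0
        for i in range(depth):
            code = (code << widths[i]) | children[p[:i]][p[i]]
        codes[p] = code
    return codes, widths


def build_inverted_index(rows, dims):
    """Map each (dimension, value) pair to the set of row ids containing it."""
    index = defaultdict(set)
    for tid, row in enumerate(rows):
        for d in dims:
            index[(d, row[d])].add(tid)
    return index


# Hypothetical region hierarchy: basin > province > county.
regions = [
    ("Yangtze", "Hubei", "Wuhan"),
    ("Yangtze", "Hubei", "Yichang"),
    ("Yangtze", "Hunan", "Changsha"),
    ("Yellow", "Henan", "Zhengzhou"),
]
codes, widths = build_level_codes(regions)

# The fact rows store the region code instead of a foreign key,
# so rolling up needs a bit shift rather than a join.
rows = [
    {"region": codes[regions[0]], "type": "reservoir", "scale": "large"},
    {"region": codes[regions[1]], "type": "reservoir", "scale": "small"},
    {"region": codes[regions[2]], "type": "sluice", "scale": "large"},
    {"region": codes[regions[3]], "type": "reservoir", "scale": "large"},
]
index = build_inverted_index(rows, ["type", "scale"])

# Query: count large reservoirs per basin.
hits = index[("type", "reservoir")] & index[("scale", "large")]
shift = sum(widths[1:])  # drop the province and county bits
by_basin = Counter(rows[tid]["region"] >> shift for tid in hits)
```

In this sketch the selection on non-hierarchical dimensions is a set intersection over inverted lists, and the roll-up to basin level is a right shift of the stored code, which is the sense in which encoded keys avoid a join against a dimension table.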
Source
Computer Technology and Development (《计算机技术与发展》)
2017, Issue 2, pp. 134-138 (5 pages)
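Funding
National Key Technology R&D Program of China (2015BAB07B01)
Special Scientific Research Fund for Public Welfare Industry, Ministry of Water Resources (201501022)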
Keywords
water census
multi-dimensional data
data cube
data analysis system
hierarchical dimension encoding fragment