Research on Index Method of Massive Hydrology Data (cited by: 2)
Abstract: Hydrology data are stored in many forms, are large in volume, and cover many kinds of hydrology entities. Each class of hydrology entity has both basic descriptive information and a series of measurement (business) data, and the two kinds of data differ in how they are stored and how often they are updated. Hydrology business retrieval requires fast lookup of an entity's basic information as well as combined queries that follow the dependency between the basic descriptive information and the business information. Current cloud environments, however, lack an efficient index method that accounts for the dependencies among multiple data types. In addition, the rapid growth of hydrology data poses a major challenge to retrieval performance. This paper therefore proposes HRB, a distributed two-level index structure based on Hadoop, which builds a different index for each data type and retrieval requirement. Experiments show that HRB creates indexes more efficiently than a conventional distributed index and retrieves data faster once the data volume reaches the ten-million level, indicating that HRB has practical value.
Source: Computer and Modernization (《计算机与现代化》), 2017, No. 10, pp. 29-35, 41 (8 pages)
Funding: National Natural Science Foundation of China (61370091, 61602151)
Keywords: hydrology entities, two-level index structure, distributed index, Hadoop
