Abstract
The rapid development of cloud computing has made it feasible to store and manage massive data. However, because the underlying storage model has changed fundamentally, the mature indexing techniques of traditional relational database management systems can neither be applied directly to massive data nor be migrated easily to the cloud environment. This paper analyzes and compares the two fundamentally different logical structures for secondary indexes in the cloud, namely the centralized approach and the distributed approach. Building on the strengths of both while avoiding their weaknesses, it proposes the Regional Bitmap Index (RBI), a scalable indexing mechanism that efficiently supports retrieval over massive data in the cloud. By fully exploiting the parallel computing resources of the cloud, RBI improves the response time of a single query. Meanwhile, each local node uses the global distribution information it holds to avoid unnecessary retrieval work, so query throughput is sustained even when a large number of requests arrive concurrently. Experiments on a real dataset show that RBI achieves substantially better query performance than the other methods compared.
Source
Chinese Journal of Computers (《计算机学报》)
EI
CSCD
Peking University Core Journals (北大核心)
2012, No. 11, pp. 2306-2316 (11 pages)
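Funding
Supported by:
National High Technology Research and Development Program of China (863 Program) (2012AA011002, 2011AA010706)
National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (核高基) (2010ZX01042-002-002-02, 2010ZX01042-001-003-05)
National Natural Science Foundation of China (60973002, 61170003, 61073018)
Shenzhen-Hong Kong Innovation Circle Project (JSE201007160004A)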