
A Multi-Dimensional Data Indexing Mechanism Supporting Complex Queries in Cloud Computing Environments (cited by: 14)

A Multi-Dimensional Indexing for Complex Query in Cloud Computing
Abstract: Data indexing is one of the most important techniques for distributed storage systems in cloud computing environments, because application data is partitioned across the storage nodes of a data center. As Web applications develop, queries on metadata are becoming more complex, yet existing indexing mechanisms for distributed storage systems do not support complex queries such as multi-dimensional queries and range queries. To address this problem, the paper proposes M-Index, a multi-dimensional data indexing mechanism. M-Index uses the pyramid technique to map the multi-dimensional metadata of each data item to a one-dimensional index, and introduces the prefix binary tree (PBT), a concept first defined in this paper, to support range queries. The primary key of a data item in the storage system is obtained by extracting the prefix of its one-dimensional index and of the valid PBT nodes. Data is then published, according to this key and consistent hashing, to an overlay network formed by the storage nodes. A query algorithm based on M-Index converts a complex query request into one-dimensional query keys, and thereby supports multi-dimensional queries and range queries. Theoretical analysis shows that M-Index achieves good efficiency for complex queries and returns complete query results, and experiments show that it outperforms existing related mechanisms in query efficiency and load balancing.
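The two building blocks named in the abstract, the pyramid technique and consistent hashing, can be illustrated with a minimal Python sketch. It is not the paper's M-Index: the abstract does not specify how the PBT is built, so the prefix-extraction step is replaced here by a hypothetical `storage_key` helper that formats the pyramid value to a fixed precision. The names `pyramid_value`, `HashRing`, and `storage_key`, the node labels, and the assumption that metadata is normalized to [0, 1] in every dimension are all illustrative.

```python
import bisect
import hashlib


def pyramid_value(point):
    """Map a point in [0, 1]^d to a one-dimensional pyramid value.

    The data space is split into 2*d pyramids that meet at the centre
    (0.5, ..., 0.5). The value is pyramid_number + height, where height
    is the distance from the centre along the deciding dimension.
    """
    d = len(point)
    # The dimension farthest from the centre decides which pyramid holds the point.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    height = abs(0.5 - point[j_max])
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    return pyramid + height


def storage_key(point, digits=4):
    """Hypothetical stand-in for the prefix-derived primary key."""
    return f"{pyramid_value(point):.{digits}f}"


class HashRing:
    """A small consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=16):
        # Each physical node is placed on the ring `replicas` times.
        self._ring = sorted(
            (self._hash(f"{node}#{r}"), node)
            for node in nodes
            for r in range(replicas)
        )
        self._positions = [position for position, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first virtual node after the key's position.
        idx = bisect.bisect_right(self._positions, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    for metadata in ([0.2, 0.6], [0.2, 0.9], [0.55, 0.45]):
        key = storage_key(metadata)
        print(metadata, "->", key, "->", ring.node_for(key))
```

In this sketch, a range query would still have to be translated into one interval of pyramid values per intersected pyramid. According to the abstract, the paper's query algorithm and the PBT handle that translation.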
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 8, pp. 1592-1603 (12 pages).
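Funding: National Basic Research Program of China (973 Program) (2010CB328104); National Natural Science Foundation of China (61070161, 61202449, 61272054, 61003257); National High Technology Research and Development Program of China (863 Program) (2013AA013503); National Key Technology R&D Program of China (2010BAI88B03, 2011BAK21B02); Specialized Research Fund for the Doctoral Program of Higher Education (20110092130002); National Science and Technology Major Project (2010ZX01044-001-001); Natural Science Foundation of Jiangsu Province (BK2008030); Jiangsu Province Prospective Joint Research Project of Industry, Universities and Research Institutes (BY2012202); Jiangsu Province Special Fund for the Transformation of Scientific and Technological Achievements (BA2012036); Jiangsu Provincial Key Laboratory of Network and Information Security (BM2003201); Key Laboratory of Computer Network and Information Integration, Ministry of Education (Southeast University) (93K-9); Shanghai Key Laboratory of Scalable Computing and Systems (Shanghai Jiao Tong University) (2010DS680095); China Education and Research Grid (ChinaGrid) project.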
Keywords: cloud computing; data indexing; multi-dimensional query; range query; consistent hashing

