
Parallel computing framework for big data (大数据并行计算框架)
Cited by: 6
Abstract: Big data is a current focus of research and application in information technology, but most existing work concentrates on systems and applications, and comparatively little addresses theoretical foundations. Building on computational complexity theory, this paper studies the computability of big data and its computational principles in response to the challenges of volume, velocity, and variety. First, diverse types of big data are abstracted into metric space for a unified representation, which addresses variety. Second, the data are partitioned in metric space according to distance. Finally, parallel computing theory and methods such as NC-class computation are applied to solve big data problems in parallel, which addresses volume and velocity. From a broader perspective, and based on the characteristics and whole life cycle of big data, the paper also proposes strategies and techniques for processing big data and argues that big data research calls for a change in ways of thinking.
Source: Chinese Science Bulletin (《科学通报》), 2015, No. 5, pp. 566-569 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
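Funding: Supported by the National High Technology Research and Development Program of China (2012AA01A309), the NSFC-Guangdong Joint Fund (U1301252), the National Natural Science Foundation of China (61170076, 61471243), the Guangdong Provincial Key Laboratory Construction Project (2012A061400024), and the Shenzhen Science and Technology Program (JCYJ20120613155632545, SGLH20131010163759789, JCYJ20140418095735561).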
Keywords: NC-class computation, metric space, data partitioning, computability
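
The three steps named in the abstract (represent objects in a metric space, partition them by distance, process the partitions in parallel) can be illustrated with a small sketch. This is not the authors' algorithm: nearest-pivot partitioning is only one common distance-based scheme, the Hamming distance stands in for whatever metric suits the data, and the function names `hamming`, `partition_by_pivots`, `range_query`, and `parallel_range_query` are hypothetical. The thread pool only shows the structure of the parallel step; it does not model NC-class complexity bounds.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

# Assumption: objects are equal-length strings; any type with a valid metric would work.
Metric = Callable[[str, str], float]


def hamming(a: str, b: str) -> float:
    """Hamming distance between equal-length strings (satisfies the metric axioms)."""
    if len(a) != len(b):
        raise ValueError("hamming distance needs equal-length strings")
    return float(sum(x != y for x, y in zip(a, b)))


def partition_by_pivots(
    data: Sequence[str], pivots: Sequence[str], dist: Metric
) -> Dict[int, List[str]]:
    """Assign each object to its closest pivot; ties go to the lower pivot index."""
    parts: Dict[int, List[str]] = {i: [] for i in range(len(pivots))}
    for obj in data:
        nearest = min(range(len(pivots)), key=lambda i: dist(obj, pivots[i]))
        parts[nearest].append(obj)
    return parts


def range_query(part: List[str], query: str, radius: float, dist: Metric) -> List[str]:
    """Scan one partition for objects within `radius` of `query`."""
    return [obj for obj in part if dist(obj, query) <= radius]


def parallel_range_query(
    parts: Dict[int, List[str]],
    query: str,
    radius: float,
    dist: Metric,
    workers: int = 4,
) -> List[str]:
    """Run the same range query on every partition concurrently and merge the hits."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(range_query, part, query, radius, dist)
            for part in parts.values()
        ]
        hits: List[str] = []
        for future in futures:
            hits.extend(future.result())
    return sorted(hits)
```

Only the distance function touches the data representation, so swapping in a different metric (for example, Euclidean distance on vectors) would leave the partitioning and parallel steps unchanged. In CPython, threads do not speed up CPU-bound scans; a process pool or a distributed runtime would be the natural substitute in practice.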

