
Technology Research of Large Data Information Retrieval Based on Cloud Computing (基于云计算的大数据信息检索技术研究)

Cited by: 9
Abstract: With the rapid development of cloud computing, the volume of information is growing explosively. Cheap cloud storage and computing power have accelerated the generation of big data, and have made it necessary to solve the problems of collecting and retrieving information from it. More than 50% of big data is unstructured, so the great majority of it is stored as files. The data is split into many chunks stored on chunk servers, and the corresponding metadata is stored on the master server. This paper discusses how to collect the web URLs and keywords of big data, and how to retrieve the information they point to.
Authors: WU Xue-qin (吴雪琴), SHU Xiao-ling (舒晓苓) (Computer Department of Sichuan TOP IT Vocational Institute, Chengdu 611743, China)
Source: Computer Knowledge and Technology (《电脑知识与技术》), 2014, No. 4, pp. 2388-2390 (3 pages)
Keywords: cloud computing; big data; information collection; retrieval mechanism
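
The abstract names two mechanisms: files split into chunks with metadata kept on a master server, and collection of web URLs and keywords for later retrieval. The following is only a minimal in-memory sketch of those two ideas, not the authors' implementation. The names `MasterServer`, `store_file`, `read_file`, `collect`, `search` and the tiny `CHUNK_SIZE` are all hypothetical, and the plain dictionary `chunk_store` stands in for real chunk servers.

```python
from collections import defaultdict

# Tiny chunk size chosen only for illustration (an assumption, not from the paper).
CHUNK_SIZE = 8  # bytes


class MasterServer:
    """Hypothetical master: holds metadata only, never file contents."""

    def __init__(self):
        self.file_chunks = {}                 # file name -> ordered list of chunk ids
        self.keyword_urls = defaultdict(set)  # keyword -> set of web URLs

    def register_file(self, name, chunk_ids):
        """Record which chunks, in order, make up a file."""
        self.file_chunks[name] = list(chunk_ids)

    def collect(self, url, keywords):
        """Information collection: index a web URL under each of its keywords."""
        for kw in keywords:
            self.keyword_urls[kw.lower()].add(url)

    def search(self, *keywords):
        """Retrieval: return the sorted URLs indexed under every query keyword."""
        sets = [self.keyword_urls.get(kw.lower(), set()) for kw in keywords]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


def store_file(name, data, master, chunk_store):
    """Split data into fixed-size chunks, store them, and register the metadata."""
    chunk_ids = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk_id = f"{name}#{offset // CHUNK_SIZE}"
        chunk_store[chunk_id] = data[offset:offset + CHUNK_SIZE]
        chunk_ids.append(chunk_id)
    master.register_file(name, chunk_ids)
    return chunk_ids


def read_file(name, master, chunk_store):
    """Ask the master for the chunk list, then reassemble the file from the chunks."""
    return b"".join(chunk_store[cid] for cid in master.file_chunks[name])
```

In this sketch a reader first asks the master which chunks make up a file and only then fetches the chunk contents, so the master stays small because it never holds file data. A multi-keyword query is answered by intersecting the URL sets of the individual keywords.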
