Topic Analysis of Big Data Area: Research Based on WOS and Wikipedia
(Chinese title, translated: Topic Analysis of the Big Data Field: A Cross-Corroboration Study Based on WOS and Wikipedia)

Cited by: 4
Abstract: Using two data sources, WOS (Web of Science) and Wikipedia, this paper applies word-frequency statistics and text classification to content related to big data. It identifies where the two sources agree and differ on big data topics, and from that comparison distills a set of topic categories for the field. The categories common to both sources include the overall perspective, the technical level, the application level, and entities and activities. The finer-grained topics include data and data sources; big data processing and analysis technologies; big data systems and applications; promotion by countries, regions, and enterprises; discussion of society and people; and changes in industries and academic disciplines. Finally, the paper draws on related data to explore the research frontiers of the big data field.
Authors: 许鑫 (Xu Xin), 冯诗惠 (Feng Shihui)
Source: Journal of Intelligence (《情报杂志》), indexed in CSSCI and the Peking University Core Journals list, 2014, No. 11, pp. 124-130 (7 pages)
Keywords: big data; topic analysis; Web of Science; Wikipedia
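
The abstract describes comparing word frequencies across two sources to find shared and source-specific topics. As a rough illustration only, the idea could be sketched as follows. The stop-word list, the tokenizer, the `top_n` cutoff, and the sample snippets are all assumptions made for this sketch; the record does not describe the authors' actual preprocessing or tooling.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the paper's real preprocessing is not given in this record.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "are", "with"}


def term_frequencies(texts):
    """Count lowercase alphabetic tokens across a list of texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


def compare_sources(freq_a, freq_b, top_n=10):
    """Split the top-N terms of two sources into shared and source-specific sets."""
    top_a = {term for term, _ in freq_a.most_common(top_n)}
    top_b = {term for term, _ in freq_b.most_common(top_n)}
    return {
        "shared": top_a & top_b,   # terms prominent in both sources
        "only_a": top_a - top_b,   # terms prominent only in the first source
        "only_b": top_b - top_a,   # terms prominent only in the second source
    }


# Made-up snippets standing in for WOS records and Wikipedia text.
wos_texts = [
    "Big data analytics for cloud computing systems",
    "MapReduce processing of big data in cloud systems",
]
wiki_texts = [
    "Big data is a field about data sets too large for traditional software",
    "Big data privacy and society",
]

wos_freq = term_frequencies(wos_texts)
wiki_freq = term_frequencies(wiki_texts)
result = compare_sources(wos_freq, wiki_freq, top_n=5)
print(sorted(result["shared"]))  # ['big', 'data']
```

Comparing only the top-ranked terms keeps the overlap readable. A real study would likely need a proper tokenizer, phrase detection, and a manual classification step on top of the raw counts.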