
A Hadoop-Based Parallel Social Network Mining System (cited by: 10)
Abstract: In recent years, social networks, led by microblogging platforms, have grown rapidly. These platforms hold a large amount of valuable information, including users' views on current events and their opinions on daily life and personal relationships. Because microblog data is both very large and difficult to obtain, how to mine social network data effectively has become a focus of data mining research over the past two years. This work designs and implements a parallel social network mining system based on Hadoop. The system comprises a distributed database, a parallel crawler, parallel data processing, and a set of parallel data mining algorithms. It can efficiently collect, analyze, and mine massive social network data, and it supports tasks such as community analysis, user behavior analysis, user classification, and microblog classification.
Author: Li Guanchen (李冠辰)
Source: Software (《软件》), 2013, No. 12, pp. 127-131 (5 pages)
Keywords: Computer Application Technology; Hadoop Platform; Social Network; Data Mining

References (8)

  • 1 PAGE L, BRIN S, MOTWANI R. The PageRank citation ranking: bringing order to the web[J]. 1999: 68-70.
  • 2 ZHANG Ye, WEI Ran, GU Yanfeng, YAN Meng. Research on spectral anomaly feature analysis and extraction based on wavelet transform[J]. 新型工业化, 2013, 2(1): 38-45. Cited by: 7
  • 3 YUAN Jia, GUO Yanhui. Distributed processing of massive logs based on RabbitMQ[J]. Software (《软件》), 2013, 34(7): 19-23. Cited by: 20
  • 4 BARABASI A L, JEONG H, NEDA Z. Evolution of the social network of scientific collaborations[J]. Physica A: Statistical Mechanics and its Applications, 2002, (03): 590-614.
  • 5 KOSSINETS G, WATTS D J. Empirical analysis of an evolving social network[J]. Science, 2006, (5757): 88-90.
  • 6 ZHUANG L, DUNAGAN J, SIMON D R. Characterizing Botnets from Email Spam Records[J]. LEET, 2008: 1-9.
  • 7 DEAN J, GHEMAWAT S. MapReduce: simplified data processing on large clusters[J]. Communications of the ACM, 2008, (01): 107-113.
  • 8 ANIL R, DUNNING T, FRIEDMAN E. Mahout in action[M]. Manning, 2011.
