
Blog community detection based on formal concept analysis (cited by: 1)
Abstract The trawling algorithm has several problems: it finds too many Web communities, pages are repeated at a high rate across communities, and its strict definition of a community produces isolated communities. To address these problems, a Blog community detection algorithm based on Formal Concept Analysis (FCA) is proposed. First, a concept lattice is constructed from the link relations between Blogs. Next, the original concept lattice is partitioned into equivalence classes through an algebraic decomposition of the lattice. Finally, the structural similarity between the extents and intents of the concepts in each partition is measured, and community cores are merged to form communities. The experimental results show that community cores whose network density exceeds 40% account for 83.420% of all cores in the test data set, that the network diameter of the merged communities is 3, and that the richness of community content is improved. The proposed algorithm can be applied effectively to community detection in Blogs, micro-Blogs, and other social networks, and it has significant application value and practical relevance.
Source Journal of Computer Applications (indexed in CSCD; Peking University Core Journal), 2013, No. 1, pp. 189-191, 198 (4 pages)
Funding National Natural Science Foundation of China (61070122)
Keywords Blog community; community detection; Formal Concept Analysis (FCA); link analysis; social network
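
The abstract outlines the pipeline only at a high level, so the following Python sketch is an illustration under stated assumptions and not the authors' implementation. It treats each Blog as an object and the Blogs it links to as its attributes, enumerates the formal concepts of that context by closing the link sets under intersection, and then computes two quantities the abstract mentions. The similarity measure (an unweighted average of Jaccard indices over extents and intents) and the density definition (directed links among the members of a concept, divided by n(n-1)) are hypothetical choices, as are all function names; the paper's exact formulas and its lattice partitioning step are not given in this record and are not reproduced here.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (extent, intent)


def build_concepts(links: Dict[str, Set[str]]) -> List[Concept]:
    """Enumerate all formal concepts of the context 'blog links to blog'.

    Intents are exactly the intersections of the blogs' link sets, plus
    the set of all link targets (the intent of the bottom concept).
    """
    all_targets = frozenset().union(*links.values()) if links else frozenset()
    intents = {all_targets}
    for targets in links.values():
        # Intersect the new row with every intent found so far.
        intents |= {frozenset(targets) & intent for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every blog that links to all targets in the intent.
        extent = frozenset(b for b, targets in links.items() if intent <= targets)
        concepts.append((extent, intent))
    return concepts


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard index; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def concept_similarity(c1: Concept, c2: Concept) -> float:
    """Assumed measure: mean of extent similarity and intent similarity."""
    return 0.5 * (jaccard(c1[0], c2[0]) + jaccard(c1[1], c2[1]))


def core_density(concept: Concept, links: Dict[str, Set[str]]) -> float:
    """Assumed measure: directed link density among extent and intent blogs."""
    members = concept[0] | concept[1]
    n = len(members)
    if n < 2:
        return 0.0
    # Count links that stay inside the member set, ignoring self-links.
    edges = sum(len((links.get(m, set()) & members) - {m}) for m in members)
    return edges / (n * (n - 1))


# Toy link data (hypothetical): blogs a, b, c and the blogs they link to.
example_links = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"y", "z"}}
example_concepts = build_concepts(example_links)
```

In this sketch, a pair of concepts whose similarity exceeds some chosen threshold would be a candidate for merging into one community; the threshold and the merging rule are left open because the record does not state them.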