期刊文献+

大规模数据图上的个性化子图匹配算法 被引量:5

Personalized Subgraph Matching in Large-Scale Graphs
下载PDF
导出
摘要 以图结构来描述实体间复杂的关联关系被广泛应用于多种不同的领域.但是,随着这些领域的蓬勃发展,图结构数据的数据量也与日俱增.如何根据用户提交的查询图,在大规模数据图上高效地返回满足用户要求的匹配成为目前学术界和工业界首要的研究问题.然而,之前的工作,多数都是在无权图上查询,没有考虑用户的个性化需求,并且算法运行在大规模数据图上的执行时间并不是很理想.提出一个适用于有权查询图并且适用于大规模数据图上查询的个性化子图匹配算法(personalized subgraph matching,PSM).首先,通过已有的社团检测GN算法将数据图划分成若干个子区域,并构建2个线下索引:GP-Tree索引和排序边集索引(sorted lists index,SL);然后,基于索引结构,通过增加优化策略进而加速子图匹配;最后,本文通过大量实验验证了本文算法的有效性和扩展性. Graph is widely used in various fields to describe the complex relationships between entities.However,along with the development of these fields,data size is also increasing,how to efficiently find matches that meet usersrequirements on large-scale data becomes a critical problem on current academic and industrial research.And previous work on subgraph matching dont consider user-defined requirements for some special edges so they are powerless against query graphs with weights and existing algorithms are not suitable for large-scale graphs.In this paper,we propose personalized subgraph matching(PSM)algorithm to adapt to weighted query and large-scale graphs.Firstly,we partition the large graph into some sub-regions by GN algorithm and build two index structures:GP-Tree and sorted lists index(SL)which are constructed offline.Then,based on the index structures,we use optimization strategies into our method to accelerate subgraph matching.Finally,our extensive experimental results on real and synthetic data sets evaluate the efficiency and scalability of our proposed method.
出处 《计算机研究与发展》 EI CSCD 北大核心 2015年第S1期48-55,共8页 Journal of Computer Research and Development
基金 黑龙江省自然科学基金项目(F201206) 黑龙江省高校科技创新团队建设计划基金项目(2013TD012)
关键词 子图匹配 图数据 图模式匹配 图分割 索引技术 subgraph matching graph data graph pattern matching graph partition index technology
  • 相关文献

参考文献15

  • 1Girvan M,Newman M E J.Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America . 2002
  • 2Han W-S,Lee J,Lee J-H.TurboISO:Towards ultra Fast and robust subgraph isomorphism search in large graph databases. SIGMOD Record . 2013
  • 3Zhao Sun,Hongzhi Wang,Haixun Wang,Bin Shao,Jianzhong Li.Efficient subgraph matching on billion node graphs. Proceedings of the VLDB Endowment . 2012
  • 4S. Zhang,S. Li,J. Yang.Gaddi:Distance Index Based Subgraph Matching in Biological Networks. Proc. of EDBT . 2009
  • 5Peixiang Zhao,Jiawei Han.On graph query optimization in large networks. Proceedings of the VLDB Endowment . 2010
  • 6Luigi P. Cordella,Pasquale Foggia,Carlo Sansone,Mario Vento.A (Sub)Graph Isomorphism Algorithm for Matching Large Graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence . 2004
  • 7张硕,高宏,李建中,邹兆年.不确定图数据库中高效查询处理[J].计算机学报,2009,32(10):2066-2079. 被引量:24
  • 8J. R. Ullmann.An Algorithm for Subgraph Isomorphism[J]. Journal of the ACM (JACM) . 1976 (1)
  • 9T. Lappas,L. Liu,E. Terzi.'Finding a team of experts in social networks'. Proc.of KDD’’09 . 2009
  • 10Ren X,Wang J.Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs. Proceedings of the VLDB Endowment . 2015

二级参考文献15

  • 1周水庚 蔚赵春 蒋豪良.图结构数据搜索的概念、问题与进展[J].中国计算机学会通讯,2007,3(8):59-65.
  • 2Shasha D, Wang T L J, Giugno R. Algorithmics and applications of tree and graph searching//Proceedings of the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. Madison, 2002:39-52.
  • 3Yan X, Yu P S, Han J. Graph indexing: A frequent structure-based approach//Proeeedings of the 2004 ACM SIGMOD International Conference on Management of Data. Paris, 2004:335-346.
  • 4Saito R, Suzuki H, Hayashizaki Y. Interaction generality, a measurement to assess the reliability of a protein-protein interaction. Nucleic Acids Research, 2002, 30(5): 1163-1168.
  • 5李建中 于戈 周傲英.不确定性数据管理的要求与挑战[J].中国计算机学会通讯,2009,5(4):6-14.
  • 6高宏 张炜.不确定图数据管理研究现状[J].中国计算机学会通讯,2009,5(4):31-36.
  • 7Dalvi N N, Suciu D. Efficient query evaluation on probabilistic databases. VLDB Journal, 2007, 16(4): 523-544.
  • 8Soliman M A, Ilyas I F, Chang K C. Top-k query processing in uncertain databases//Proceedings of the 23rd International Conference on Data Engineering. Istanbul, 2007:896-905.
  • 9Hintsanen P. The most reliable subgraph problem//Proceedings of the llth European Conference on Principles and Practice of Knowledge Discovery in Databases. Warsaw, 2007: 471-478.
  • 10Hintsanen P, Toivonen H. Finding reliable subgraphs from large probabilistic graphs. Data Mining and Knowledge Discovery, 2008, 17(1): 3-23.

共引文献25

同被引文献33

引证文献5

二级引证文献16

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部