期刊文献+

融合启发式和Boosting的子图匹配基数估计方法

Subgraph Matching Cardinality Estimation Combining Heuristic and Boosting Method
下载PDF
导出
摘要 由于在建模关联信息方面具备天然优势,图数据已在社交网络、知识表示等方面被广泛运用。但是相较于传统的关系型数据库系统,图数据管理中的以子图匹配为代表的一系列基础操作仍有进一步优化的空间。在一个完善的图数据库系统中,为实现多个子图匹配任务的优化调度,往往需要对每个任务的代价,尤其是匹配结果的基数进行准确预估。然而,现有的子图匹配基数预估方法缺乏对图结构信息的充分考量,且在多结点匹配中存在严重的潜在累计误差。BoostCard方法通过对各结点的邻域信息进行表示,来聚合结点的局部结构特征,同时运用统计方法估计不同结点之间连接成边的概率从而实现匹配基数的初步预测。而后在初期获取的结点结构特征的基础上,采用提升学习的思想对预测结果进行全局补偿,可实现智能化的子图匹配基数估计,是一种具有广泛适用性的子图匹配预测框架。通过实验可知,相比于传统的统计方法,BoostCard在真实数据集的子图匹配基数估计,尤其是多结点子图匹配问题上有明显的性能提升。 Attributed to its innate advantage in modeling relational information,graph data have been widely leveraged in various applications including social network,knowledge representation,etc.Compared with traditional relational database systems,primitive operators in graph data management,represented by subgraph matching,still observe space for further optimization.In a fully-fledged graph database system,in order to optimize the schedule of multiple subgraph matching tasks,it usually necessitates accurate estimation of the cost of every task,especially of the cardinality of matching results.However,current methods for cardinality estimation of subgraph matching fail to fully exploit the structural information in the graph,and moreover,it may potentially result in serious error accumulation.In this light,BoostCard is proposed to aggregate the local structural features via representing neighborhood information of nodes.Meanwhile,it utilizes statistical method to predict whether there is an edge between two nodes.And it adopts the Boosting strategy to compensate globally based on the acquired features of nodes.Hence,it can achieve intelligent cardinality estimation of subgraph matching,making itself a widely applicable framework for the task.In comparison with conventional statistic-based methods,BoostCard offers significant performance gain in the cardinality estimation of multi-node subgraph matching over real-life data sets.
作者 侯文哲 赵翔 HOU Wenzhe;ZHAO Xiang(Science and Technology on Information Systems Engineering Laboratory,National University of Defense Technology,Changsha 410000,China)
出处 《计算机科学与探索》 CSCD 北大核心 2022年第3期582-590,共9页 Journal of Frontiers of Computer Science and Technology
基金 国家自然科学基金(61872446,61902417,71971212) 湖南省自然科学基金(2019JJ20024)。
关键词 图数据 子图匹配 基数估计 提升学习 graph data subgraph matching cardinality estimation Boosting learning
  • 相关文献

参考文献1

二级参考文献64

  • 1Microsoft Academic Search. Explore researchers' cooperating network.[2009-12-01 J.[2014-11-20]. http://academic. research. microsoft. com/VisualExplorer.
  • 2Brynielsson J, Hogberg J, Kaati L, et al. Detecting social positions using simulation[C]//Proc of 2010 Int Conf on Advances in Social Networks Analysis and Mining (ASONAM). Alamitos, CA: IEEE, 2010: 48-55.
  • 3Palantir. Products Built for A Purpose.[2004-01-01].[2014-11-20]. https://www.palantir.com/.
  • 4Malewicz G, Austern M H, Bik A J C, et al, Pregel , A system for large-scale graph processing[C]//Proc of the 2010 ACM SIGMOD Int Conf on Management of Data. New York: ACM, 2010: 135-146.
  • 5Sarwat M, Elnikety S, He Y, et al. Horton: Online query execution engine for large distributed graphs[C]//Proc of the 28th IEEE Int Conf on Data Engineering (ICDE J. Alamitos, CA: IEEE, 2012: 1289-1292.
  • 6Low y, Gonzalez J, Kyrola A, et al. Graphlab , A new framework for parallel machine learning[C]//Proc of the 26th Conf on Uncertainty in Artificial Intelligence (UA]). Oregon, USA: AUAI, 2010.
  • 7Michael R G, David S J. Computers and intractability: A guide to the theory of NP-completeness[R]. New York: W. H. Freeman Company, 1979.
  • 8Christmas W J, Kittler J, Petrou M. Structural matching in computer vision using probabilistic relaxation[J]. Pattern Analysis and Machine Intelligence, 1995, 17(8): 749-764.
  • 9Ullmann J R. An algorithm for subgraph isomorphism[J]. Journal of the ACM (JACM), 1976,23(1): 31-42.
  • 10Cordelia L P, Foggia r. Sansone C, et al. A (sub) graph isomorphism algorithm for matching large graphs[J]. IEEE Trans on Pattern Analysis and Machine Intelligence, 2004, 26(10): 1367-1372.

共引文献35

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部