摘要
以图结构来描述实体间复杂的关联关系被广泛应用于多种不同的领域.但是,随着这些领域的蓬勃发展,图结构数据的数据量也与日俱增.如何根据用户提交的查询图,在大规模数据图上高效地返回满足用户要求的匹配成为目前学术界和工业界首要的研究问题.然而,之前的工作,多数都是在无权图上查询,没有考虑用户的个性化需求,并且算法运行在大规模数据图上的执行时间并不是很理想.提出一个适用于有权查询图并且适用于大规模数据图上查询的个性化子图匹配算法(personalized subgraph matching,PSM).首先,通过已有的社团检测GN算法将数据图划分成若干个子区域,并构建2个线下索引:GP-Tree索引和排序边集索引(sorted lists index,SL);然后,基于索引结构,通过增加优化策略进而加速子图匹配;最后,本文通过大量实验验证了本文算法的有效性和扩展性.
Graph is widely used in various fields to describe the complex relationships between entities.However,along with the development of these fields,data size is also increasing,how to efficiently find matches that meet usersrequirements on large-scale data becomes a critical problem on current academic and industrial research.And previous work on subgraph matching dont consider user-defined requirements for some special edges so they are powerless against query graphs with weights and existing algorithms are not suitable for large-scale graphs.In this paper,we propose personalized subgraph matching(PSM)algorithm to adapt to weighted query and large-scale graphs.Firstly,we partition the large graph into some sub-regions by GN algorithm and build two index structures:GP-Tree and sorted lists index(SL)which are constructed offline.Then,based on the index structures,we use optimization strategies into our method to accelerate subgraph matching.Finally,our extensive experimental results on real and synthetic data sets evaluate the efficiency and scalability of our proposed method.
出处
《计算机研究与发展》
EI
CSCD
北大核心
2015年第S1期48-55,共8页
Journal of Computer Research and Development
基金
黑龙江省自然科学基金项目(F201206)
黑龙江省高校科技创新团队建设计划基金项目(2013TD012)
关键词
子图匹配
图数据
图模式匹配
图分割
索引技术
subgraph matching
graph data
graph pattern matching
graph partition
index technology