Link patterns are consensus practices characterizing how different types of objects are typically interlinked in linked data. Mining link patterns in large-scale linked data has been inefficient because of the computational complexity of mining algorithms and memory limitations. To improve scalability, partitioning strategies for pattern mining have been proposed. However, the efficiency of these strategies and the completeness of their mining results remain under discussion. In this paper we propose a novel partitioning strategy for mining link patterns in large-scale linked data, in which linked data is partitioned according to edge-labeling rules: edges are first grouped into a primary multi-partition according to their edge labels, and a feedback mechanism then produces a secondary bi-partition based on a quick mining process. Link patterns discovered locally in the partitions are then merged into global patterns. Experiments show that our partitioning strategy is feasible and efficient.
Funding: Supported by the National High-Tech Research and Development (863) Program of China (No. 2015AA015406) and the Open Project of Jiangsu Key Laboratory of Data Engineering and Knowledge Service (No. DEKS2014KT002).
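The partition-then-merge idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes linked data is given as (subject, label, object) triples, it treats a "link pattern" as a single-edge (subject type, edge label, object type) count, and the names `partition_by_edge_label`, `mine_local_patterns`, and `merge_patterns` are hypothetical. The feedback-driven secondary bi-partition is omitted because the abstract does not specify how it works.

```python
from collections import Counter, defaultdict


def partition_by_edge_label(triples):
    """Primary multi-partition: one group of edges per edge label."""
    partitions = defaultdict(list)
    for subject, label, obj in triples:
        partitions[label].append((subject, label, obj))
    return dict(partitions)


def mine_local_patterns(edges, types):
    """Toy local miner (an assumption, not the paper's miner):
    count (subject type, edge label, object type) occurrences."""
    counts = Counter()
    for subject, label, obj in edges:
        counts[(types[subject], label, types[obj])] += 1
    return counts


def merge_patterns(local_results):
    """Merge per-partition pattern counts into global counts."""
    merged = Counter()
    for counts in local_results:
        merged.update(counts)
    return merged


# Hypothetical example data.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "authorOf", "paper1"),
    ("carol", "authorOf", "paper1"),
]
types = {"alice": "Person", "bob": "Person", "carol": "Person", "paper1": "Paper"}

partitions = partition_by_edge_label(triples)
global_patterns = merge_patterns(
    mine_local_patterns(edges, types) for edges in partitions.values()
)
```

Because every pattern in this toy version involves a single edge label, each one falls entirely inside one partition and merging reduces to summing counts. Patterns that span several edge labels would cross partition boundaries, which is presumably where the completeness concerns raised in the abstract arise.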