Abstract
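Mining frequent patterns in graph data has been a research hotspot in recent years. Uniquely labeled graphs are selected for analysis. Combining graph theory with frequent-itemset generation algorithms, two algorithms are proposed: AMGM, which is based on the Apriori idea and uses matrix multiplication, and SFP, which is based on the SFP-tree. Both can effectively mine connected frequent subgraphs in simple graphs. Experiments show that both algorithms are effective and that SFP outperforms AMGM. The algorithm has also been applied to discovering authoritative pages and communities on the Web.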
Mining frequent patterns from data sets is one of the key success stories of data mining research. Currently, most efforts focus on independent data, such as the items in a market basket. However, objects in the real world are often closely related to each other. How to extract frequent patterns from these relations is the objective of this paper. Graphs are used to model the relations, and a simple type of graph is selected for analysis. Combining graph theory with algorithms for generating frequent patterns, two new algorithms are proposed. The first, named AMGM, is based on the Apriori idea and makes use of matrix multiplication. The second, SFP, is built on a new structure, the SFP-tree, and mines these simple graphs more efficiently. The performance of the algorithms is evaluated by experiments on synthetic datasets. The empirical results show that both algorithms are effective and that SFP performs better than AMGM. The algorithms are also applied to mining authoritative pages and communities on the Web, which is useful for Web mining. Potential improvements are discussed at the end of the paper.
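The abstract does not give the details of AMGM or SFP, so the following is only a minimal illustrative sketch of the general idea, not the paper's method. It assumes that every vertex label is unique within a graph, so that a graph can be represented as a set of labeled edges and subgraph containment reduces to a subset test. It then grows connected edge sets level by level in the Apriori style, without the matrix multiplication or SFP-tree structures that the paper proposes. The function names `support` and `frequent_connected_subgraphs` are hypothetical.

```python
def support(edge_set, graphs):
    """Count the graphs that contain every edge of edge_set."""
    return sum(1 for g in graphs if edge_set <= g)


def frequent_connected_subgraphs(graphs, min_support):
    """Level-wise (Apriori-style) search for frequent connected edge sets.

    Assumption: vertex labels are unique within each graph, so an
    undirected edge is identified by its sorted pair of labels.
    Returns a dict mapping each frequent connected edge set to its support.
    """
    # Normalise every edge to a sorted tuple so (u, v) equals (v, u).
    graphs = [frozenset(tuple(sorted(e)) for e in g) for g in graphs]
    all_edges = set().union(*graphs)
    frequent_edges = {
        e for e in all_edges
        if support(frozenset([e]), graphs) >= min_support
    }

    result = {}
    level = {frozenset([e]) for e in frequent_edges}
    while level:
        for edge_set in level:
            result[edge_set] = support(edge_set, graphs)
        next_level = set()
        for edge_set in level:
            vertices = {v for e in edge_set for v in e}
            for e in frequent_edges - edge_set:
                # Extending only by an edge that touches an existing
                # vertex keeps the candidate connected.
                if e[0] in vertices or e[1] in vertices:
                    candidate = edge_set | {e}
                    if (candidate not in next_level
                            and support(candidate, graphs) >= min_support):
                        next_level.add(candidate)
        level = next_level
    return result
```

Growing only from frequent edge sets relies on the usual Apriori property: an edge set cannot be frequent unless each of its subsets is frequent. The connectivity restriction means that a pair of edges with no shared vertex is never reported, even when the two edges occur together often.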
Source
《计算机研究与发展》
EI
CSCD
Peking University Core Journals (北大核心)
2005, Issue 2, pp. 230-235 (6 pages)
Journal of Computer Research and Development
Funding
National Natural Science Foundation of China (69933010, 60303008)
National High Technology Research and Development Program of China (863 Program) (2002AA4Z3430)
Keywords
SFP tree
connected frequent graph
data mining