Abstract
This paper combines the strengths of the FP-Tree algorithm for association rule mining with maximal clique theory from graph theory. The main contributions are: (1) an improved method that uses an adjacency matrix to generate frequent 2-itemsets; (2) the concept of the maximal ordered frequent itemset, together with proofs of the equivalence of the Head relation, the partition theorem, the local complexity theorem, and the merge convergence range theorem; (3) the MaxCFPTree algorithm, based on maximal clique partitioning, proposed and implemented with a scan time complexity below O(n²); (4) experiments that verify the correctness of the algorithm. The new method eases the conflict between a very large number of items and limited memory, and improves system efficiency and scalability.
Source
《软件学报》
EI
CSCD
Peking University Core Journals (北大核心)
2004, No. 8, pp. 1198-1207 (10 pages)
Journal of Software
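Funding
National Natural Science Foundation of China
Doctoral Program Special Fund of the Ministry of Education of China
Natural Science Foundation of Guangxi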