Abstract
Sequential pattern mining is an important area of Web log mining. The WAP algorithm recursively constructs a large number of conditional trees during mining. To address this drawback, this paper proposes an improved algorithm, NGCWAP. NGCWAP uses pre-order and post-order traversal numbers to track which suffix subtrees the frequent sequences are located in, which avoids constructing conditional trees and thereby reduces memory consumption. Experiments verify the correctness and efficiency of the improved algorithm.
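The role of the traversal numbers can be illustrated with a minimal sketch. It is not the NGCWAP algorithm itself, whose details the abstract does not give; it only shows the general property such an approach can rely on: once every node carries a pre-order and a post-order number, whether one node lies in the subtree of another can be decided by comparing four integers, with no additional tree being built. The names `Node`, `number_tree`, and `is_ancestor`, and the small example tree, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A tree node labelled with an access event (hypothetical structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    pre: int = -1   # pre-order traversal number
    post: int = -1  # post-order traversal number


def number_tree(root: Node) -> None:
    """Assign pre-order and post-order numbers in a single depth-first walk."""
    pre_counter = 0
    post_counter = 0

    def visit(node: Node) -> None:
        nonlocal pre_counter, post_counter
        node.pre = pre_counter          # numbered on entry
        pre_counter += 1
        for child in node.children:
            visit(child)
        node.post = post_counter        # numbered on exit
        post_counter += 1

    visit(root)


def is_ancestor(a: Node, b: Node) -> bool:
    """True if b lies strictly inside the subtree rooted at a."""
    return a.pre < b.pre and a.post > b.post


# Example tree (assumed data): root -> a -> {b, c}, root -> b2
b = Node("b")
c = Node("c")
a = Node("a", [b, c])
b2 = Node("b")
root = Node("root", [a, b2])
number_tree(root)

print(is_ancestor(a, b))    # b is inside a's subtree
print(is_ancestor(a, b2))   # b2 is in a different subtree
```

Because the test is a constant-time comparison, occurrences of an event can be grouped by the subtree they fall in without copying nodes into a new conditional tree, which is consistent with the memory saving the abstract claims.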
Source
《计算机科学》
CSCD
PKU Core Journals (北大核心)
2012, No. 2, pp. 206-208, 239 (4 pages in total)
Computer Science
Funding
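Supported by the National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (核高基), grant 2009ZX01045-005-001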