Funding: Supported by the National Natural Science Foundation of China (70371015)
Abstract: The number of frequent subtrees usually grows exponentially with tree size because of combinatorial explosion. As a result, there are too many frequent subtrees for users to manage and use. To solve this problem, we generalize a compression framework based on δ-clusters to the problem of compressing frequent-subtree sets, and propose an algorithm, RPTlocal, that mines a compressed set of frequent subtrees directly. The algorithm sacrifices the theoretical bounds but still achieves good compression quality. By pruning the search space and generating frequent subtrees directly, it is also efficient. Experimental results show that the set of representative subtrees mined by RPTlocal is almost two orders of magnitude smaller than the whole collection of closed subtrees, and that RPTlocal is more efficient than CMtreeMiner, an algorithm for mining both closed and maximal frequent subtrees.
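The abstract does not define a δ-cluster, so the following is only a minimal sketch under stated assumptions. It assumes the formulation common in compressed frequent-pattern mining: a pattern P is δ-covered by a representative R when P is contained in R and the Jaccard distance between their supporting transaction sets is at most δ. Subtrees are modeled crudely as sets of labeled edges, with edge-set inclusion standing in for subtree inclusion. The selection loop is a plain greedy set cover, not RPTlocal itself, and the names `Pattern`, `support_distance`, `delta_covers`, and `greedy_representatives` are hypothetical.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Set, Tuple


class Pattern(NamedTuple):
    # Assumption: a subtree is modeled as a set of (parent, child) labeled edges.
    edges: FrozenSet[Tuple[str, str]]
    # IDs of the database trees that contain this pattern.
    support: FrozenSet[int]


def support_distance(p: Pattern, r: Pattern) -> float:
    """Jaccard distance between the supporting transaction sets."""
    union = p.support | r.support
    if not union:
        return 0.0
    return 1.0 - len(p.support & r.support) / len(union)


def delta_covers(r: Pattern, p: Pattern, delta: float) -> bool:
    """True if r contains p (edge-set inclusion) and their supports are close."""
    return p.edges <= r.edges and support_distance(p, r) <= delta


def greedy_representatives(patterns: Dict[str, Pattern], delta: float) -> List[str]:
    """Greedy set cover: repeatedly pick the pattern covering most uncovered ones."""
    uncovered: Set[str] = set(patterns)
    representatives: List[str] = []
    while uncovered:
        best_name, best_cover = None, set()
        for name in sorted(patterns):  # sorted for deterministic tie-breaking
            cover = {
                q for q in uncovered
                if delta_covers(patterns[name], patterns[q], delta)
            }
            if len(cover) > len(best_cover):
                best_name, best_cover = name, cover
        # Every pattern covers itself, so best_cover is never empty here.
        representatives.append(best_name)
        uncovered -= best_cover
    return representatives


# Toy data (invented for illustration, not from the paper).
toy = {
    "A-B": Pattern(frozenset({("A", "B")}), frozenset({1, 2, 3, 4})),
    "A-C": Pattern(frozenset({("A", "C")}), frozenset({1, 2, 3, 5})),
    "A-B,A-C": Pattern(frozenset({("A", "B"), ("A", "C")}), frozenset({1, 2, 3})),
    "A-B,B-D": Pattern(frozenset({("A", "B"), ("B", "D")}), frozenset({1})),
}

reps = greedy_representatives(toy, delta=0.3)
print(reps)
```

With δ = 0.3 in this toy data, the two single-edge patterns are absorbed by the two-edge pattern that contains them, because its support differs from each of theirs by only one tree. The rarely supported pattern remains its own representative. A larger δ trades fidelity of the recovered supports for a smaller representative set.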