Abstract
A new indexing method is proposed for databases of labeled rooted unordered trees. All frequent subtrees are first mined, and the discriminative ones among them are selected as indexing features. The subtrees in the feature set are then converted into sequences, and the index is organized as a prefix tree. A search algorithm over this index tree is also given, together with two optimizations that reduce the search space: Apriori pruning and maximum discriminative subtrees. Experimental results show that, compared with path-based indexing methods, the frequent-subtree-based index performs better in both index size and query cost.
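The sequence conversion and prefix-tree organization described above can be illustrated with a minimal sketch. The abstract does not specify the encoding, so the pre-order encoding with a `$` backtrack marker, and the names `Node`, `canonical_sequence`, and `PrefixTree`, are assumptions chosen for illustration rather than the paper's actual design. Frequent-subtree mining and discriminative selection are not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of a labeled rooted tree (hypothetical helper type)."""
    label: str
    children: list = field(default_factory=list)


def canonical_sequence(node):
    """Encode a rooted unordered tree as a tuple of labels.

    Child encodings are sorted so that sibling order does not affect
    the result. "$" is an assumed backtrack marker that closes a node,
    so labels themselves must not be "$".
    """
    parts = sorted(canonical_sequence(child) for child in node.children)
    seq = [node.label]
    for part in parts:
        seq.extend(part)
    seq.append("$")
    return tuple(seq)


class PrefixTree:
    """A trie over feature sequences, built from nested dicts."""

    def __init__(self):
        self.root = {}

    def insert(self, seq, tree_ids):
        """Store the ids of database trees that contain this feature."""
        node = self.root
        for symbol in seq:
            node = node.setdefault(symbol, {})
        node[None] = set(tree_ids)  # None marks the end of a feature

    def lookup(self, seq):
        """Return the stored id set, or None if seq is not a feature."""
        node = self.root
        for symbol in seq:
            node = node.get(symbol)
            if node is None:
                return None
        return node.get(None)
```

With this encoding, two trees that differ only in sibling order map to the same sequence, so a query subtree can be looked up regardless of how its children happen to be listed.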
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition)
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2008, No. 3, pp. 103-106 (4 pages)
Keywords
data mining
frequent subtree
database indexing
subtree search
index tree