Abstract
The index built over XML data has a considerable effect on retrieval efficiency. After an in-depth analysis of existing XML structural indexes, and taking the characteristics of XML documents into account, this paper proposes LSS (Level Structure Summary), a structural index for keyword search. LSS merges the nodes of the XML tree that share the same label path, which lets it determine efficiently whether nodes are homogeneous or heterogeneous. The paper implements CSCAN, an algorithm that constructs the LSS index, and designs LSSearch, an XML keyword retrieval algorithm built on LSS. Guided by the LSS index, LSSearch splits each keyword's original inverted list into subsets of different types and then evaluates the query over all of these subsets. Experimental results show that LSS helps reduce the size of keyword inverted lists for XML documents dramatically and improves retrieval efficiency.
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2010, No. 12, pp. 120-124 (5 pages)
Funding
Supported by the National 863 Program key project (2009AA1Z134) and the National Natural Science Foundation of China (60803043, 60720106001).
Keywords
XML
Keyword search
Index
Inverted list
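
The two ideas named in the abstract, merging nodes that share a label path and splitting each keyword's inverted list into subsets, can be illustrated with a small Python sketch. This is not the paper's CSCAN or LSSearch, whose details the abstract does not give. The function name `index_document`, the Dewey-style node ids, the whitespace tokenisation, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_document(xml_text):
    """Hypothetical helper: return (summary, inverted) for one XML document.

    summary:  label path -> ids of all nodes that share that path
    inverted: keyword -> label path -> ids of nodes whose own text
              contains the keyword (tail text is ignored for brevity)
    """
    root = ET.fromstring(xml_text.strip())
    summary = defaultdict(list)
    inverted = defaultdict(lambda: defaultdict(list))

    def visit(elem, parent_path, node_id):
        # The label path is the sequence of tags from the root to this node.
        path = parent_path + (elem.tag,)
        summary[path].append(node_id)
        for word in (elem.text or "").lower().split():
            bucket = inverted[word][path]
            # Record each node at most once per keyword.
            if not bucket or bucket[-1] != node_id:
                bucket.append(node_id)
        for i, child in enumerate(elem):
            # Assumed Dewey-style id: parent id plus child position.
            visit(child, path, node_id + (i,))

    visit(root, (), (0,))
    return summary, inverted


# Sample document (invented for this sketch).
DOC = """
<bib>
  <book><title>XML index</title><author>Li</author></book>
  <book><title>Keyword search</title><author>Wang</author></book>
  <paper><title>XML search</title></paper>
</bib>
"""

summary, inverted = index_document(DOC)

# Nodes with the same label path are merged under one summary entry.
for path, nodes in summary.items():
    print("/".join(path), nodes)

# The inverted list of "xml" is split into one subset per label path.
for path, nodes in inverted["xml"].items():
    print("xml ->", "/".join(path), nodes)
```

In this sketch two nodes count as homogeneous when they fall under the same summary entry, so the check is a comparison of label paths. A query can then be evaluated one subset at a time instead of scanning each keyword's full inverted list.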