Abstract
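As XML becomes the standard for data representation and exchange on the Internet, accessing the data in XML documents quickly and accurately has become a key problem in urgent need of a solution, and building a path index is an important means of improving query efficiency. This paper designs a path index based on PATRICIA tries, called the PT index. The index has three features: (1) it is based on the PATRICIA-trie structure, which enables fast retrieval; (2) it uses compressed encoding so that the path index fits in memory; (3) it contains both structural and text information, so results can be produced by querying the index alone, without opening the original document. The paper then analyzes the time and space complexity of the PT index and compares it experimentally with three typical index structures; the results show that it is more efficient for path queries.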
With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data is becoming increasingly important. This poses a new challenge for indexing and searching XML data, because conventional approaches based on the relational model may not meet the processing requirements of XML data. In this paper, we propose a path index based on PATRICIA tries, namely the PT index. The PT index structure offers several novel features. First, its PATRICIA-trie structure supports fast search. Second, the path indexes are compressed so that they can be stored in memory. Third, because the PT index includes both the structure and the text of the XML data, results can be obtained from the PT index without reading the original XML data. We analyze the time complexity and space complexity of the PT index. Experimental results from our prototype system implementation show that the PT index can outperform some representative index approaches, such as DataGuide, B+-tree index and Index Fabric.
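To illustrate the general idea of answering path queries from a compressed trie, the following is a minimal sketch and not the PT index described in the paper. It assumes a simple radix tree with string edge labels (a simplified relative of a PATRICIA trie, which is classically defined at the bit level), and a hypothetical key encoding of the form `label-path=text`. The names `Node`, `insert`, `search_prefix`, and the sample keys are illustrative assumptions; the paper's compressed encoding is not reproduced here.

```python
class Node:
    """A radix-tree node; each outgoing edge carries a multi-character label."""
    def __init__(self):
        self.children = {}      # first char of edge label -> (edge label, child Node)
        self.terminal = False   # True if a stored key ends at this node


def common_len(a, b):
    """Length of the longest common prefix of strings a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(root, key):
    """Insert key, splitting an edge where the key diverges from its label."""
    node = root
    while key:
        entry = node.children.get(key[0])
        if entry is None:
            leaf = Node()
            leaf.terminal = True
            node.children[key[0]] = (key, leaf)
            return
        label, child = entry
        common = common_len(label, key)
        if common < len(label):
            # Split the edge: a new middle node keeps the shared prefix.
            mid = Node()
            mid.children[label[common]] = (label[common:], child)
            node.children[key[0]] = (label[:common], mid)
            child = mid
        node = child
        key = key[common:]
    node.terminal = True


def search_prefix(root, prefix):
    """Return all stored keys that start with prefix, in sorted order."""
    node, matched, rest = root, "", prefix
    while rest:
        entry = node.children.get(rest[0])
        if entry is None:
            return []
        label, child = entry
        common = common_len(label, rest)
        if common < len(label) and common < len(rest):
            return []           # diverges in the middle of an edge
        matched += label
        rest = rest[common:]
        node = child
    results, stack = [], [(node, matched)]
    while stack:
        current, key = stack.pop()
        if current.terminal:
            results.append(key)
        for label, child in current.children.values():
            stack.append((child, key + label))
    return sorted(results)


# Hypothetical keys: root-to-leaf label path, then "=", then the text value.
root = Node()
for k in ["/book/title=XML", "/book/title=XPath",
          "/book/author=Wang", "/bib/year=2006"]:
    insert(root, k)

print(search_prefix(root, "/book/title="))
```

Because each key carries both the label path and the text value, a path query such as `/book/title` is answered from the trie alone, which mirrors the abstract's claim that results can be obtained without reading the original document.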
Source
《小型微型计算机系统》
CSCD
PKU Core (Peking University Core Journals list)
2006, No. 3, pp. 474-480 (7 pages)
Journal of Chinese Computer Systems
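Funding
Supported by the National Natural Science Foundation of China (60373018).
Supported by the National Key Science and Technology Project of the Tenth Five-Year Plan (2001BA101A03).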