Abstract
Text data makes up a considerable portion of an XML document. To address the large storage footprint of XML text data and the low efficiency of searching compressed text data, this paper proposes an index technique for compressed XML text data based on BWC (Burrows-Wheeler Compression). The technique constructs a full-text data model and stores the text data of an XML document in a globally compressed self-index. Experimental results show that the technique effectively supports text search in the XPath query language while consuming relatively little memory, enabling in-memory search over small and medium-scale data.
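The abstract names backward searching over a Burrows-Wheeler based self-index but gives no detail. The following is a minimal, generic sketch of that idea and is not the authors' implementation: it keeps the transform and the occurrence tables uncompressed, assumes the text never contains the sentinel character, and all function names (`build_index`, `backward_search`, `locate`) are hypothetical.

```python
def build_index(text, sentinel="\0"):
    """Build an uncompressed BWT index (assumes `sentinel` is absent from `text`)."""
    s = text + sentinel
    # Suffix array by naive sorting; fine for a small illustration.
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    # BWT: the character preceding each sorted suffix (wraps to the sentinel).
    bwt = "".join(s[i - 1] for i in sa)

    # counts[c] = number of occurrences of c in the text.
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1

    # C[c] = number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ, sa


def backward_search(pattern, C, occ, n):
    """Return the half-open suffix-array range [lo, hi) of suffixes starting with pattern."""
    lo, hi = 0, n
    # Process the pattern from its last character to its first.
    for ch in reversed(pattern):
        if ch not in C:
            return 0, 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


def locate(text, pattern):
    """Return the sorted start positions of pattern in text."""
    C, occ, sa = build_index(text)
    lo, hi = backward_search(pattern, C, occ, len(sa))
    return sorted(sa[lo:hi])
```

For example, `locate("abracadabra", "abra")` returns the two start positions of the pattern. A real self-index would replace the plain `occ` tables and suffix array with compressed structures, which is where the memory savings reported in the abstract would come from.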
Source
Journal of Kunming University (《昆明学院学报》)
2011, Issue 3, pp. 60-63 (4 pages)
Funding
Anhui Province Natural Science Research Funded Project (KJ2010B280)
Keywords
self-index
backward searching
text data
BWC (Burrows-Wheeler Compression)