A new way of indexing and processing twig patterns in an XML documents is proposed in this paper. Every path in XML document can be transformed into a sequence of labels by Structure-Encoded that constructs a one-to-o...A new way of indexing and processing twig patterns in an XML documents is proposed in this paper. Every path in XML document can be transformed into a sequence of labels by Structure-Encoded that constructs a one-to-one correspondence between XML tree and sequence. Base on identifying characteristics of nodes in XML tree, the elements are classified and clustered. During query proceeding, the twig pattern is also transformed into its Structure-Encoded. By performing subsequence matching on the set of sequences in XML documents, all the occurrences of path in the XML documents are refined. Using the index, the numbers of elements retrieved are minimized. The search results with pertinent format provide more structure information without any false dismissals or false alarms. The index also supports keyword search Experiment results indicate the index has significantly efficiency with high precision.展开更多
路径查询是XML查询的一个主要特征,现已提出了多种XML索引方法.DTD的结构信息对于XML索引的建立及查询效率的提高很重要,但现有的大部分索引方法没有利用DTD这一有效资源.提出一种利用DTD的XML索引方法--DBXI(DTD-based XML indexing),...路径查询是XML查询的一个主要特征,现已提出了多种XML索引方法.DTD的结构信息对于XML索引的建立及查询效率的提高很重要,但现有的大部分索引方法没有利用DTD这一有效资源.提出一种利用DTD的XML索引方法--DBXI(DTD-based XML indexing),该方法采用了新的编码方法,可使路径查询具备如下特征:对于由N个元素/属性组成的具有1个谓词约束的路径表达式,DBXI处理每个XML文档仅需0次或1次元素/属性结点集的结构连接操作;对于在XML文档中不存在匹配结构的路径查询,DBXI能够在比现有的XML索引方法较短的时间内给出无查询结果的判断.实验表明,与Lore,SphinX和XISS等索引方法相比,DBXI能够缩短路径查询的响应时间.展开更多
基金Supported by the National Natural Science Foundation of China (60473085)
文摘A new way of indexing and processing twig patterns in an XML documents is proposed in this paper. Every path in XML document can be transformed into a sequence of labels by Structure-Encoded that constructs a one-to-one correspondence between XML tree and sequence. Base on identifying characteristics of nodes in XML tree, the elements are classified and clustered. During query proceeding, the twig pattern is also transformed into its Structure-Encoded. By performing subsequence matching on the set of sequences in XML documents, all the occurrences of path in the XML documents are refined. Using the index, the numbers of elements retrieved are minimized. The search results with pertinent format provide more structure information without any false dismissals or false alarms. The index also supports keyword search Experiment results indicate the index has significantly efficiency with high precision.
文摘路径查询是XML查询的一个主要特征,现已提出了多种XML索引方法.DTD的结构信息对于XML索引的建立及查询效率的提高很重要,但现有的大部分索引方法没有利用DTD这一有效资源.提出一种利用DTD的XML索引方法--DBXI(DTD-based XML indexing),该方法采用了新的编码方法,可使路径查询具备如下特征:对于由N个元素/属性组成的具有1个谓词约束的路径表达式,DBXI处理每个XML文档仅需0次或1次元素/属性结点集的结构连接操作;对于在XML文档中不存在匹配结构的路径查询,DBXI能够在比现有的XML索引方法较短的时间内给出无查询结果的判断.实验表明,与Lore,SphinX和XISS等索引方法相比,DBXI能够缩短路径查询的响应时间.