摘要
由于在线信息变化频繁 ,XML文档变化快速检测成为Internet查询系统、搜索引擎以及连续查询系统的关键技术 目前国际上的研究主要集中于有序模式的XML文档比较 ,针对有序模式最好的算法复杂度为O(nlogn) ,其中n为文档的长度 ,而针对无序模式为多项式时间复杂度 为提高处理效率 ,提出一种基于特征路径的变化检测算法 ,将传统标号树匹配问题转换为基于特征路径的无重复路径标号树的匹配问题 ,同时适于有序和无序两种模式 ,复杂度为O(n) ,其中n为文档结点的个数 实验证明KF Diff
Since online information changes frequently, being able to quickly detect changes in XML documents is important to Internet query systems, search engines, and continuous query systems Most previous work in change detection on XML or other hierarchically structured data uses the ordered tree model, with the best complexity of O(n log n ), where n is the size of the document The best algorithm for unordered mode achieves polynomial time in complexity In this paper, a Key path based change detection algorithm named KF Diff+is represented, which transforms the traditional tree to tree correction into the comparing of the Key trees which are substantially label trees without duplicate paths, with the complexity of O(n) , where n is the total number of nodes in the documents In addition, KF Diff+is tailored to both ordered trees and unordered trees, which is also a significant difference compared with the previous work Experiments show that KF Diff+can handle XML documents at extreme speed
出处
《计算机研究与发展》
EI
CSCD
北大核心
2003年第9期1375-1381,共7页
Journal of Computer Research and Development
基金
国家"八六三"高技术研究发展计划基金项目 ( 2 0 0 2AA1160 40 )