

DML-Diff: content-based change detection algorithm for XML documents
Abstract: Most previous research on change detection algorithms has focused on optimizing time or space efficiency, while the accuracy of the detected changes remains unsatisfactory. For example, when only part of a sentence changes, existing algorithms report the whole sentence as changed instead of locating the changed words within it. This paper proposes DML-Diff, a novel content-based change detection algorithm that combines the tree structure of XML documents with the computation of text similarity. It emphasizes changes in textual content, which makes the change detection results more precise.
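Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2012, No. 8, pp. 3000-3003 (4 pages)
Funding: Chinese Academy of Sciences Informatization Special Fund project (INFO-115-D01); Chinese Academy of Sciences Knowledge Innovation Program major project (KGCX1-YW-13)
Keywords: XML (Extensible Markup Language); change detection; content-based; version control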
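
The abstract's central example, locating the changed words inside a sentence instead of flagging the whole sentence, can be illustrated with a small sketch. This is not the DML-Diff algorithm itself, whose details the abstract does not give; it uses Python's standard `difflib.SequenceMatcher` as a stand-in for the text comparison step, and the function names `word_changes` and `text_similarity` are hypothetical, chosen only for this illustration.

```python
from difflib import SequenceMatcher


def word_changes(old: str, new: str):
    """Return (operation, old_words, new_words) for each changed run of words.

    Assumption: text is tokenized on whitespace; a real implementation
    would need a tokenizer suited to the document language.
    """
    a, b = old.split(), new.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the spans that actually differ
    ]


def text_similarity(old: str, new: str) -> float:
    """Word-level similarity in [0, 1] between two text nodes."""
    return SequenceMatcher(a=old.split(), b=new.split(), autojunk=False).ratio()


old_text = "The quick brown fox jumps over the lazy dog"
new_text = "The quick red fox jumps over the sleepy dog"

changes = word_changes(old_text, new_text)
similarity = text_similarity(old_text, new_text)

for operation, old_words, new_words in changes:
    print(operation, old_words, "->", new_words)
print(f"similarity = {similarity:.3f}")
```

In a tree-based setting, a similarity score of this kind could be used to decide whether two text nodes are versions of each other before the word-level differences are reported; how DML-Diff actually makes that decision is not stated in the abstract.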










