Abstract
Most previous research on change detection algorithms has focused on optimizing running time or space usage, while the precision of the detected changes remains unsatisfactory. For example, when only part of a sentence changes, existing algorithms report the whole sentence as changed rather than locating the words that actually changed. This paper proposes DML-Diff, a novel content-oriented change detection algorithm that combines the tree structure of XML documents with text similarity computation. By emphasizing changes in text content, it produces more precise change detection results.
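The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea and not the DML-Diff algorithm itself. It assumes that text nodes are paired when they share the same element path and their word-level similarity reaches a threshold, and that changes are then reported per word rather than per sentence. The function names (`text_nodes`, `word_changes`, `content_diff`) and the `threshold` parameter are hypothetical.

```python
import difflib
import xml.etree.ElementTree as ET


def text_nodes(root):
    """Return (path, text) pairs for every element with non-empty text.

    Tail text (text following a child element) is ignored in this sketch.
    """
    found = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        text = (elem.text or "").strip()
        if text:
            found.append((here, text))
        for child in elem:
            walk(child, here)

    walk(root, "")
    return found


def word_changes(old_text, new_text):
    """Return the word-level edits that turn old_text into new_text."""
    a, b = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def content_diff(old_xml, new_xml, threshold=0.5):
    """Pair text nodes by path and similarity, then report word-level changes.

    threshold is an assumed cut-off: below it, two texts are treated as
    unrelated (a delete plus an insert) instead of an update.
    """
    old_nodes = text_nodes(ET.fromstring(old_xml))
    unmatched = text_nodes(ET.fromstring(new_xml))
    report = []
    for path, old_text in old_nodes:
        best, best_score = None, 0.0
        for cand in unmatched:
            if cand[0] != path:
                continue  # structural constraint: same element path only
            score = difflib.SequenceMatcher(
                None, old_text.split(), cand[1].split()
            ).ratio()
            if score > best_score:
                best, best_score = cand, score
        if best is None or best_score < threshold:
            report.append((path, "delete", old_text))
            continue
        unmatched.remove(best)
        changes = word_changes(old_text, best[1])
        if changes:
            report.append((path, "update", changes))
    # Anything left in the new document had no counterpart in the old one.
    report.extend((path, "insert", text) for path, text in unmatched)
    return report


if __name__ == "__main__":
    old = "<doc><p>The quick brown fox jumps</p><p>Unchanged line here</p></doc>"
    new = "<doc><p>The quick red fox jumps</p><p>Unchanged line here</p></doc>"
    print(content_diff(old, new))
```

In this sketch, a sentence in which one word was replaced is reported as a single word-level update on its element path, while the unchanged sentence produces no entry.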
Source
《计算机应用研究》
CSCD
Peking University Core Journal (北大核心)
2012, Issue 8, pp. 3000-3003 (4 pages)
Application Research of Computers
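Funding
Special Fund for Informatization of the Chinese Academy of Sciences (INFO-115-D01)
Major Project of the Knowledge Innovation Program of the Chinese Academy of Sciences (KGCX1-YW-13)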