Abstract
Extensible Markup Language (XML) is emerging as a standard for information exchange among applications on the World Wide Web, and the need for high-performance techniques to process XML data is growing. Most existing methods for computing XML document similarity are based on the structural features of documents. This paper proposes a new method for computing XML document similarity based on comprehensive semantics. The method draws on both the structural information and the content information of XML documents and shows some promise for practical application.
Source
《杭州电子科技大学学报(自然科学版)》
2009, Issue 3, pp. 64-67 (4 pages)
Journal of Hangzhou Dianzi University:Natural Sciences