Abstract
For component clustering, this paper proposes a recursive algorithm for computing the similarity between components described in XML; it effectively measures both the structural and the semantic information contained in a component's XML description document. A document similarity matrix is constructed, a genetic algorithm maps the high-dimensional samples onto a two-dimensional plane, and the k-means algorithm then clusters them to obtain a globally optimal component clustering. Finally, experiments on a component repository test model show that the component clustering algorithm based on XML similarity is feasible and effective in component retrieval practice.
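The recursive similarity measure and the similarity matrix can be illustrated with a minimal sketch. It is not the paper's algorithm, whose exact formula the abstract does not give. The names `tag_sim`, `element_sim`, `similarity_matrix`, and the weight `alpha` are hypothetical, and semantic similarity between tag names is reduced here to exact string equality as a stand-in. The genetic-algorithm projection and the k-means step are not shown.

```python
import xml.etree.ElementTree as ET


def tag_sim(a, b):
    # Assumption: a stand-in for semantic similarity of tag names.
    # A real system might consult a thesaurus or ontology instead.
    return 1.0 if a == b else 0.0


def element_sim(x, y, alpha=0.5):
    """Recursive similarity in [0, 1] between two XML elements.

    alpha (assumed parameter) weights a node's own tag against the
    best-match similarity of its children.
    """
    own = tag_sim(x.tag, y.tag)
    xs, ys = list(x), list(y)
    if not xs and not ys:
        # Two leaves: only the tags can be compared.
        return own
    if not xs or not ys:
        # Only one side has children: the structural part contributes nothing.
        return alpha * own
    # Pairwise child similarities, computed once and reused in both directions.
    table = [[element_sim(c, d, alpha) for d in ys] for c in xs]
    fwd = sum(max(row) for row in table)        # best match for each child of x
    bwd = sum(max(col) for col in zip(*table))  # best match for each child of y
    children = (fwd + bwd) / (len(xs) + len(ys))
    return alpha * own + (1 - alpha) * children


def similarity_matrix(docs, alpha=0.5):
    """Symmetric matrix of pairwise similarities between XML strings."""
    roots = [ET.fromstring(d) for d in docs]
    n = len(roots)
    return [[element_sim(roots[i], roots[j], alpha) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    docs = [
        "<component><name/><interface><method/></interface></component>",
        "<component><name/><interface><method/><event/></interface></component>",
        "<library><author/></library>",
    ]
    for row in similarity_matrix(docs):
        print([round(v, 3) for v in row])
```

Averaging the best matches in both directions keeps the measure symmetric, so the resulting matrix can be converted to distances (for example `1 - similarity`) before any dimensionality reduction or clustering step.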
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journal (北大核心)
2009, Issue 2, pp. 507-510 (4 pages)
Keywords
XML
component
semantic similarity
genetic algorithm
clustering