
Analysis of clustering components based on XML documents similarity (基于XML文档相似性的构件聚类分析) Cited by: 7
Abstract: For component clustering, a recursive algorithm is proposed to compute the similarity between components described in XML. It effectively measures both the structural and the semantic information contained in the components' XML description documents. A document similarity matrix is constructed, a genetic algorithm maps the high-dimensional samples onto a two-dimensional plane, and the k-means algorithm is then applied to obtain a globally optimal clustering of the components. Finally, experiments on a component repository test model show that the XML-similarity-based component clustering algorithm is feasible and effective for component retrieval.
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal (北大核心), 2009, No. 2, pp. 507-510 (4 pages)
Keywords: XML; component; semantic similarity; genetic algorithm; clustering
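
The abstract describes a recursive similarity measure over XML component descriptions and a document similarity matrix built from it, but this record does not give the paper's formulas. The following is therefore only a minimal sketch under stated assumptions, not the authors' algorithm: the scoring rule (a fixed tag-match weight `TAG_WEIGHT`, token overlap for leaf text, symmetric best-match averaging over children), the function names, and the sample component descriptions in `DOCS` are all hypothetical. The genetic-algorithm projection to two dimensions and the k-means step are not shown.

```python
import xml.etree.ElementTree as ET

# Assumed share of the score granted when two element tags match
# (structural part); the remainder comes from content (semantic part).
TAG_WEIGHT = 0.5


def token_overlap(a, b):
    """Jaccard overlap of whitespace-separated tokens in two text values."""
    ta = set((a or "").lower().split())
    tb = set((b or "").lower().split())
    if not ta and not tb:
        return 1.0  # two empty leaves are treated as identical
    return len(ta & tb) / len(ta | tb)


def best_match_total(xs, ys):
    """Sum, over each element in xs, of its best similarity to any element in ys."""
    return sum(max(similarity(x, y) for y in ys) for x in xs)


def similarity(a, b):
    """Recursive similarity in [0, 1] between two XML elements."""
    if a.tag != b.tag:
        return 0.0
    kids_a, kids_b = list(a), list(b)
    if not kids_a and not kids_b:
        # Both are leaves: compare their text content.
        content = token_overlap(a.text, b.text)
    elif not kids_a or not kids_b:
        # One is a leaf and the other is not: no content credit.
        content = 0.0
    else:
        # Recurse into children and average best matches in both directions.
        total = best_match_total(kids_a, kids_b) + best_match_total(kids_b, kids_a)
        content = total / (len(kids_a) + len(kids_b))
    return TAG_WEIGHT + (1.0 - TAG_WEIGHT) * content


def similarity_matrix(xml_strings):
    """Build the symmetric pairwise similarity matrix for a list of XML strings."""
    roots = [ET.fromstring(s) for s in xml_strings]
    n = len(roots)
    # A document is fully similar to itself, so the diagonal is 1.0.
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = similarity(roots[i], roots[j])
    return matrix


# Hypothetical component descriptions, for illustration only.
DOCS = [
    "<component><name>sort list</name><lang>java</lang></component>",
    "<component><name>sort array</name><lang>java</lang></component>",
    "<component><name>send mail</name><lang>c</lang></component>",
]

MATRIX = similarity_matrix(DOCS)
for row in MATRIX:
    print(["%.3f" % v for v in row])
```

In this sketch the two sorting components score higher against each other than either does against the mail component, which is the kind of signal a later clustering step would rely on. Each row of `MATRIX` could serve as the high-dimensional sample that the abstract says is projected onto a plane before k-means is run.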

