期刊文献+

一种有效的贪婪模式匹配算法 被引量:5

An Efficient Greedy Algorithm for Schema Matching
下载PDF
导出
摘要 模式匹配问题是意图获得两个模式中所包含个体对象之间的语义匹配和映射,其结果表示源模式的个体对象与目标模式的个体对象之间存在特定的语义关联.它在数据库应用领域起到关键性的作用,例如数据集成、电子商务、数据仓库、XML消息交换等,特别地,它已成为元数据管理的基本问题.然而,模式匹配很大程度上依赖人工的操作,是一个费时费力的过程.模式匹配问题可以归约为一个组合优化问题:多标记图匹配问题.首先,将模式表示为多标记图,将模式匹配转换为多标记图匹配问题.其次,提出多标记图的相似性度量方法,进而提出基于多标记图相似性的模式匹配目标优化函数.最后,在这个目标函数基础上设计实现了一个贪婪匹配算法,其最显著的特点是综合多种可用的标记信息,灵活准确地获得最优的匹配结果. Schema matching is the task of finding semantic correspondences between elements of two schemas, which plays a key role in many database applications, such as data integration, electronic commerce, data warehouse, semantic query processing, and XML message exchange, etc. Especially, it is a basic research issue in metadata management. Unfortunately, it still remains largely a manual, labor- intensive, and expensive process. In this paper, the schema matching problem is treated as a combinatorial problem. Firstly, schemas are transformed into multi-labeled graphs, which are the internal schema model for schema matching. Therefore, the schema matching problem is reduced to the labeled graph matching problem. Secondly, a generic graph similarity measure is discussed, which uses the labels of nodes and the edges to compute the similarity between the two schemas. Then, an objective function based on the multi- labeled graph similarity is proposed. Based on the objective function, a greedy matching algorithm is designed to find the desired matching state for schema matching. A prominent characteristic of this method is that the algorithm combines the feasible matching information to obtain optimal matching. Finally, some schema samples are used to test the greedy matching algorithm. The test results confirm that the algorithm is effective, which can obtain mapping results with high quality.
作者 张治 施鹏飞
出处 《计算机研究与发展》 EI CSCD 北大核心 2007年第11期1903-1911,共9页 Journal of Computer Research and Development
基金 国家"九七三"重点基础研究发展规划基金项目(G1998030408)
关键词 模式 模式匹配 多标记图 标记图匹配 标记图相似性 schema schema matching multi-labeled graph labeled graph matching labeled graphsimilarity
  • 相关文献

参考文献16

  • 1E Rahm,P A Bernstein.A survey of approaches to automatic schema matching[J].The VLDB Journal,2001,10(4):334-350
  • 2P Bernstein.Generic model management:A database infrastructure for schema manipulation[G].In:CoopIS 2001,LNCS 2172.Berlin:Springer,2001
  • 3J Berlin,A Motro.Autoplex:Automated discovery of content for virtual databases[G].In:CoopIS 2001,LNCS 2172.Berlin:Springer,2001
  • 4A Doan,P Domingos,A Halevy.Learning to match the schemas of data sources:A multistrategy approach[J].Machine Learning,2003,50(3):279-301
  • 5W Li,C Clifton.SEMINT:A tool for identifying attribute correspondences in heterogeneous databases using neural network[J].Data and Knowledge Engineering,2000,33 (1):49 -84
  • 6F Giunchiglia,P Shvaiko,M Yatskevich.S-Match:An algorithm and an implementation of semantic matching[C].In:Proc of the ESWS'04.Berlin:Springer,2004
  • 7S Melnik.Generic model management-Concepts and algorithms[G].In:LNCS 2967.Berlin:Springer,2004
  • 8J Madhavan,P A Bernstein,E Rahm.Generic schema matching with Cupid[C].In:Proc of the 27th VLDB Conf.San Francisco:Margan Kaufmann,2001
  • 9张治,车皓阳,施鹏飞.模式匹配问题的描述框架与算法模型[J].模式识别与人工智能,2006,19(6):715-721. 被引量:7
  • 10P Hell.Algorithmic aspects of graph homomorphisms[OL].http://www.cs.sfu.ca/- pavol/,2005-11

二级参考文献12

  • 1Rahm E, Bernstein P A. A Survey of Approaches to Automatic Schema Matching. The International Journal on Very Large Data Bases, 2001, 10(4): 334-350
  • 2Batini C, Lenzerini M, Navathe S B. A Comparative Analysis of Methodologies for Database Schema Integration. ACM Computing Surveys, 1986, 18(4): 323-364
  • 3Do H H, Melnik S, Rahm E. Comparison of Schema Matching Evaluations//Chaudhri A B, Jeckle M, Rahm E, et al, eds. Lecture Notes in Computer Science. London, UK: Springer- Verlag, 2002, 2593:221-237
  • 4Do H H, Rahm E. COMA-A System for Flexible Combination of Schema Matching Approaches // Proc of the 28th International Conference on Very Large Databases. Hongkong, China, 2002: 610-621
  • 5Bernstein P A, Generic Model Management: A Database Infrastructure for Schema Manipulation // Batini C, Giunchiglia F, Giorgini P, eds, Lecture Notes in Computer Science, Heidelberg, Germany: Springer, 2001, 2172: 1-6
  • 6Doan A, Domingos P, Halevy A. Learning to Match the Schemata of Data Sources: A Multistrategy Approach. Machine Learning, 2003, 50(3): 279-301
  • 7Hell P. Algorithmic Aspects of Graph Homomorphisms // Wensley C D, ed. London Mathematical Society Lecture Note Series. Cambridge, UK: Cambridge University Press, 2003: 239-276
  • 8Federa T, Madelaineb F, Stewartc I A, Dichotomies for Classes of Homomorphism Problems Involving Unary Functions. Theoretical Computer Science, 2004, 314(1): 1-43
  • 9Burris S N, Sankappanavar H P. A Course in Universal Algebra (Graduate Texts in Mathematics). New York, USA: Springer- Verlag, 1981
  • 10Dubhashi D P. Complexity of Logical Theories [EB/OL]. [1995-06-01]. http://www. brits. dk/LS/95/BRICS-LS-95. bib

共引文献6

同被引文献57

引证文献5

二级引证文献4

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部