Research on Entity Matching Methods Based on Attribute Information Entropy (基于属性信息熵的实体匹配方法研究) Cited by: 5

Methodology for Entities Matching Based on Attribute Information Entropy
Abstract: Interoperability among heterogeneous databases depends on identifying records that refer to the same entity. The problem arises when the same real-world entity is represented by different identifiers in different applications, so matching has to rely on the known values of the attributes that the entity descriptions share. To resolve this entity heterogeneity, the paper proposes an entity matching method based on attribute information entropy. Experimental results on real-world data show that the proposed method is very effective.
Source: Computer Engineering (《计算机工程》), 2005, No. 21, pp. 31-33 (3 pages). Indexed in: EI, CAS, CSCD, PKU Core (北大核心).
Funding: National Natural Science Foundation of China (Grant No. 60073047)
Keywords: entity matching; attribute information entropy; entity heterogeneity; heterogeneous databases
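
The abstract names the technique but does not give the algorithm, so the following is only a minimal sketch of one plausible reading, not the paper's actual method. It assumes that each shared attribute is weighted by the Shannon entropy of its values (attributes with more varied values discriminate better between entities) and that two records are scored by the total weight of the attributes on which they agree. The function names, the sample records, and the exact-equality comparison are all hypothetical.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_weights(records, attributes):
    """Normalize per-attribute entropies so that the weights sum to 1.

    Assumption: a higher-entropy attribute is more useful for telling
    entities apart, so it receives a larger weight.
    """
    entropies = {a: attribute_entropy([r[a] for r in records]) for a in attributes}
    total = sum(entropies.values())
    if total == 0:
        # Every attribute is constant: fall back to uniform weights.
        return {a: 1 / len(attributes) for a in attributes}
    return {a: h / total for a, h in entropies.items()}


def match_score(rec_a, rec_b, weights):
    """Sum the weights of the attributes on which the two records agree."""
    return sum(w for a, w in weights.items() if rec_a[a] == rec_b[a])


# Hypothetical records from one database, described by three shared attributes.
db1 = [
    {"name": "Li Wei", "city": "Chongqing", "gender": "M"},
    {"name": "Wang Fang", "city": "Chongqing", "gender": "F"},
    {"name": "Zhang Min", "city": "Chengdu", "gender": "F"},
]
weights = entropy_weights(db1, ["name", "city", "gender"])

# Hypothetical record from a second database that uses a different identifier.
candidate = {"name": "Li Wei", "city": "Chengdu", "gender": "M"}
score = match_score(db1[0], candidate, weights)
```

In this sketch a pair would be declared a match when `score` exceeds a threshold chosen by the user; the abstract does not say how, or whether, the paper sets such a threshold.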

References (6)

  • 1. Qiang Baohua, Wu Kaigui, Wu Zhongfu. A Data-type-based Approach for Identifying Corresponding Attributes in Heterogeneous Databases. In: Proceedings of 2003 International Conference on Machine Learning and Cybernetics. Xi'an, China, 2003-11.
  • 2. Qiang Baohua, Wu Kaigui, Liao Xiaofeng. Similarity Determination on Data Types in Heterogeneous Databases Using Neural Networks. In: Proceedings of 2003 International Conference on Neural Networks and Signal Processing. Nanjing, China, 2003-12.
  • 3. Copas J B, Hilton F J. Record Linkage: Statistical Models for Matching Computer Records. J. Royal Statistical Soc., 1990, 153(3): 287-320.
  • 4. Dey D, Sarkar S, De P. A Probabilistic Decision Model for Entity Matching in Heterogeneous Databases. Management Science, 1998, 44(10): 1379-1395.
  • 5. Dey D, Sarkar S, De P. A Distance-based Approach to Entity Reconciliation in Heterogeneous Databases. IEEE Transactions on Knowledge and Data Engineering, 2002, 14(3).
  • 6. Barron F H, Barrett B E. Decision Quality Using Ranked Attribute Weights. Management Science, 1996, 42(11): 1515-1523.
