
Research on a Holistic Entity Resolution Algorithm Based on Multidimensional Similarity

Cited by: 1
Abstract: When multi-source heterogeneous data are fused, multiple entity representations inevitably end up referring to the same real-world entity. Traditional entity resolution algorithms mostly rely on pairwise matching followed by transitive closure: they consider only the attribute similarity of the representations and must make a matching decision after a single comparison. This is a poor fit for current data, where attribute values are frequently missing and representations are related to one another. This paper therefore proposes a holistic entity resolution algorithm based on multidimensional similarity. The algorithm uses graph-based iterative clustering, so that entity resolution becomes a holistic process in which matching pairs influence one another over repeated iterations. During matching, similarity is measured using attribute, "context", and "relationship" information together, which further improves the accuracy of entity resolution. Comparative experiments on multiple data sets verify the performance advantages of the algorithm for entity resolution.
Authors: FAN Weizhen (范威振); CHEN Zhanfang (陈占芳); LIU Yanlong (刘燕龙). School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022
Source: Journal of Changchun University of Science and Technology (Natural Science Edition) (《长春理工大学学报(自然科学版)》), 2019, No. 4, pp. 114-119 (6 pages)
Keywords: multidimensional similarity; quasi-clique; entity resolution; iterative clustering
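The abstract describes the approach only in outline, so the following is a minimal Python sketch of the general idea, not the authors' algorithm. It combines attribute, context, and relationship similarity into one score and merges clusters iteratively, so that an earlier merge can raise the relationship similarity of other pairs. Everything specific here is an assumption made for illustration: the record layout, exact-value attribute matching, Jaccard overlap for context and relationships, the weights, the threshold, the greedy best-pair merge order, and the toy data. The quasi-clique notion listed in the keywords is not modeled.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_view(members, records, label):
    """Aggregate one cluster's members into attribute, context and relation features."""
    attrs, context, neighbours = {}, set(), set()
    for rid in members:
        rec = records[rid]
        for key, value in rec["attrs"].items():
            if value is not None:  # missing values are skipped, not penalised
                attrs.setdefault(key, set()).add(value)
        context |= rec["context"]
        # Relations are expressed in terms of the *current* cluster labels.
        neighbours |= {label[n] for n in rec["neighbours"]}
    return attrs, context, neighbours

def similarity(c1, c2, clusters, records, label, weights):
    """Weighted sum of attribute, context and relationship similarity (assumed form)."""
    a1, ctx1, nb1 = cluster_view(clusters[c1], records, label)
    a2, ctx2, nb2 = cluster_view(clusters[c2], records, label)
    shared = a1.keys() & a2.keys()
    attr = sum(1.0 for k in shared if a1[k] & a2[k]) / len(shared) if shared else 0.0
    w_attr, w_ctx, w_rel = weights
    return w_attr * attr + w_ctx * jaccard(ctx1, ctx2) + w_rel * jaccard(nb1, nb2)

def resolve(records, weights=(0.5, 0.25, 0.25), threshold=0.7):
    """Greedily merge the most similar pair of clusters until none reaches the threshold."""
    clusters = {rid: {rid} for rid in records}
    label = {rid: rid for rid in records}
    while True:
        score, a, b = max(
            ((similarity(x, y, clusters, records, label, weights), x, y)
             for x, y in combinations(sorted(clusters), 2)),
            default=(0.0, None, None),
        )
        if a is None or score < threshold:
            break
        clusters[a] = clusters[a] | clusters.pop(b)
        for rid in clusters[a]:
            label[rid] = a  # later relation scores see the merged cluster
    return sorted(sorted(m) for m in clusters.values())

# Hypothetical toy data: author references with a missing e-mail on a1.
records = {
    "a1": {"attrs": {"name": "j. smith", "email": None},
           "context": {"db", "er"}, "neighbours": {"b1"}},
    "a2": {"attrs": {"name": "j. smith", "email": "js@x.org"},
           "context": {"db", "er", "ml"}, "neighbours": {"b2"}},
    "a3": {"attrs": {"name": "j. smith", "email": "other@y.org"},
           "context": {"bio"}, "neighbours": set()},
    "b1": {"attrs": {"name": "a. lee", "email": "lee@x.org"},
           "context": {"db"}, "neighbours": {"a1"}},
    "b2": {"attrs": {"name": "a. lee", "email": "lee@x.org"},
           "context": {"db"}, "neighbours": {"a2"}},
}

print(resolve(records))  # [['a1', 'a2'], ['a3'], ['b1', 'b2']]
```

In this toy data, a1 and a2 score about 0.67 on a first pass, below the assumed threshold of 0.7, so a single pairwise comparison would leave them apart. Once b1 and b2 are merged, a1 and a2 share a neighbouring cluster, their score rises to about 0.92, and they are merged in the next iteration. The reference a3 has the same name but a conflicting e-mail and unrelated context, and stays separate.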