
An Attribute Similarity Adjusting Algorithm Based on Functional Dependency (cited by: 1)
Abstract: The accuracy of attribute similarity is one of the important factors affecting the accuracy of entity resolution (ER). To improve it, the relation between attribute similarity and functional dependency (FD) is analyzed, and principles for adjusting attribute similarity are given. FD-based methods are proposed for similarity partition, transitive similarity adjustment, and computing the cost of a similarity adjustment. On this basis, an algorithm for attribute similarity adjusting with FD (SAWFD) is put forward to improve the accuracy of attribute similarity. Experimental results show that the algorithm better distinguishes matching record pairs from non-matching record pairs and achieves higher recall, precision, and F1 scores.
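The abstract reports recall, precision, and F1 over record pairs. As a rough illustration of how such pair-level scores are commonly computed (this is the standard definition, not code from the paper; the function name `pair_metrics` and the record identifiers are hypothetical), a minimal sketch might look like this:

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def _normalize(pairs: Iterable[Pair]) -> Set[Pair]:
    """Treat (a, b) and (b, a) as the same unordered record pair."""
    return {tuple(sorted(p)) for p in pairs}


def pair_metrics(predicted: Iterable[Pair], truth: Iterable[Pair]) -> Dict[str, float]:
    """Compute pair-level precision, recall, and F1.

    predicted: record pairs the resolver declared as matches.
    truth:     record pairs that truly refer to the same entity.
    """
    pred = _normalize(predicted)
    gold = _normalize(truth)
    true_positives = len(pred & gold)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical record identifiers, for illustration only.
    predicted = [("r1", "r2"), ("r3", "r4"), ("r5", "r6")]
    truth = [("r2", "r1"), ("r3", "r4"), ("r7", "r8"), ("r9", "r10")]
    print(pair_metrics(predicted, truth))
```

In this toy input, two of the three predicted pairs are correct and two of the four true pairs are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.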
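Source: Journal of Shanghai Jiaotong University (《上海交通大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2015, No. 8, pp. 1075-1083, 1089 (10 pages)
Funding: National Natural Science Foundation of China (61070714); Pre-research Foundation of PLA University of Science and Technology (20110604)
Keywords: entity resolution; attribute similarity; functional dependencies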
