1 Introduction
Entity resolution (ER), also referred to as record linkage or entity matching [1], is a long-standing challenge in data management systems, particularly in data integration and data cleaning. Multiple data sources often describe the same real-world entity in different ways because of misspellings, typos, differing naming conventions, inconsistent use of abbreviations and full names, ongoing changes to the data (as in DBpedia), and so forth. The purpose of ER is to determine whether two data records describe the same real-world entity.
Funding: We thank Murtadha Ahmed, Yiyi Li, Ping Zhong, Yanyan Wang, and Jing Su for their invaluable suggestions. This work was supported by the Ministry of Science and Technology of China, National Key Research and Development Program (2016YFB1000703), and the National Natural Science Foundation of China (Grant Nos. 61732014, 61332006, 61472321, 61502390, and 61672432).