Abstract
Large volumes of heterogeneous data in a dataspace lack uniform semantics, so entity resolution based on attribute-value similarity cannot be carried out. To address this problem, the paper proposes a simple method that performs entity resolution from the relations between entities. A connection graph is built from decision nodes and decision relations, and a connected-component algorithm is then used to delete redundant nodes and inherit their attributes. The algorithm is verified on a small-scale dataset constructed for this purpose.
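The graph-based step described in the abstract can be illustrated with a minimal sketch. The abstract does not define decision nodes or decision relations precisely, so this example assumes that a decision relation is an edge linking two nodes judged to refer to the same real-world entity, and that "inheriting attributes" means the surviving node collects the attributes of the nodes it replaces. The function name `resolve_entities`, the data layout, and the sample records are all hypothetical, and union-find is used here as one common way to compute connected components, not necessarily the one used in the paper.

```python
from collections import defaultdict


def resolve_entities(entities, decision_edges):
    """Merge nodes connected by decision relations (assumed semantics).

    entities: dict mapping node id -> dict of attributes
    decision_edges: iterable of (id_a, id_b) pairs assumed to denote
        "these two nodes describe the same entity"
    Returns a dict mapping each surviving node id -> merged attributes.
    """
    parent = {eid: eid for eid in entities}

    def find(x):
        # Follow parent links to the component root, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as root so the result is deterministic.
            if rb < ra:
                ra, rb = rb, ra
            parent[rb] = ra

    for a, b in decision_edges:
        union(a, b)

    # Group node ids by connected component.
    groups = defaultdict(list)
    for eid in entities:
        groups[find(eid)].append(eid)

    # One node survives per component and inherits the others' attributes.
    resolved = {}
    for root, members in groups.items():
        merged = {}
        for eid in sorted(members):
            for key, value in entities[eid].items():
                merged.setdefault(key, value)  # first value seen wins
        resolved[root] = merged
    return resolved


# Hypothetical sample data: e1, e2 and e4 are linked, e3 stands alone.
entities = {
    "e1": {"name": "Pump A"},
    "e2": {"vendor": "Acme"},
    "e3": {"name": "Valve B"},
    "e4": {"serial": "X-9"},
}
edges = [("e1", "e2"), ("e2", "e4")]
result = resolve_entities(entities, edges)
```

In this sketch the four input nodes collapse to two: `e1` survives with the attributes of `e2` and `e4`, and `e3` is left unchanged. Conflicting attribute values are resolved by keeping the first one encountered, which is a simplification chosen for the example only.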
Author
Qi Xiangwei (祁祥威), Sichuan Key Laboratory of Manufacturing Industry Chain Collaboration and Information Support Technology, Southwest Jiaotong University, Chengdu 611756, China
Source
Modern Computer (《现代计算机》)
2023, No. 15, pp. 80-82 (3 pages)
Keywords
entity resolution
dataspace
entity-relationship model
data cleaning