An Algorithm Based on Attributed Relational Graphs for Name Disambiguation (cited by: 1)
Abstract: Name sharing, where several distinct entities carry the same name, is widespread in large-scale databases and digital libraries, and it complicates many research tasks. We first propose a new graph structure, the Attributed Relational Graph (ARG), which describes the features of entities and the links between them. We then present ARG-Resolution, a name disambiguation algorithm built on the ARG framework. The algorithm analyzes the authors who share a name and clusters them according to their pairwise similarity, so that each resulting cluster corresponds to one real-world entity. Experiments on real datasets show that mining the latent links between authors further improves the quality of name disambiguation and successfully resolves the name-sharing problem.
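Source: Computer Engineering & Science (《计算机工程与科学》), 2010, No. 9, pp. 61-64 (4 pages). Indexed in CSCD and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (Grant No. 60673136)
Keywords: name sharing; attributes; links; similarity; hierarchical clustering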
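
The abstract describes clustering same-name author references by similarity, with hierarchical clustering listed among the keywords. The sketch below is only an illustration of that general idea, not the paper's ARG-Resolution algorithm: the record layout (`attrs`, `coauthors`), the Jaccard-based similarity, the equal weights, the average-link rule, the `threshold` value, and the sample records are all assumptions made for this example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def record_similarity(r1, r2, w_attr=0.5, w_link=0.5):
    """Weighted mix of attribute overlap and co-author (link) overlap.

    The weights are illustrative assumptions, not values from the paper.
    """
    return (w_attr * jaccard(r1["attrs"], r2["attrs"])
            + w_link * jaccard(r1["coauthors"], r2["coauthors"]))


def cluster_similarity(c1, c2, records):
    """Average-link similarity between two clusters of record ids."""
    total = sum(record_similarity(records[i], records[j])
                for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(records, threshold=0.2):
    """Repeatedly merge the most similar pair of clusters.

    Stops when the best pair falls below `threshold` (an assumed cutoff).
    """
    clusters = [frozenset([rid]) for rid in records]
    while len(clusters) > 1:
        best = max(combinations(clusters, 2),
                   key=lambda p: cluster_similarity(p[0], p[1], records))
        if cluster_similarity(best[0], best[1], records) < threshold:
            break
        clusters = [c for c in clusters if c not in best]
        clusters.append(best[0] | best[1])
    return clusters


# Hypothetical publication records that all carry the same author name.
records = {
    "r1": {"attrs": {"data mining", "univ_x"},
           "coauthors": {"coauthor_a", "coauthor_b"}},
    "r2": {"attrs": {"data mining", "univ_x"},
           "coauthors": {"coauthor_a", "coauthor_c"}},
    "r3": {"attrs": {"databases", "univ_y"},
           "coauthors": {"coauthor_d"}},
    "r4": {"attrs": {"databases", "univ_y"},
           "coauthors": {"coauthor_d", "coauthor_e"}},
}

result = agglomerate(records)
for cluster in result:
    print(sorted(cluster))
```

On these made-up records, the two references that share an affiliation, a topic, and a co-author end up together, and the process stops once the remaining clusters have nothing in common. A real system would need a similarity measure tuned to its data and a principled stopping criterion.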