
Co-author and Affiliate Based Name Disambiguation Approach (基于合作作者与隶属机构信息的同名排歧方法) Cited by: 6
Abstract Name disambiguation is an important research topic in entity resolution; it aims to distinguish the different people who share the same name. Conventional approaches rely heavily on rich entity information and cannot disambiguate names when that information is scarce. To address this, the paper proposes a name disambiguation approach based on co-author and affiliation information. An entity relationship graph is constructed from the co-authorship relations between authors and the affiliation relations between authors and institutions, and a breadth-first search strategy is used to find the effective paths between each pair of same-name authors in the graph. The connection strength between two same-name authors is then computed from the length and number of the effective paths and the types of the edges on them, and is compared with a threshold to decide whether the two records refer to the same person. Experimental results show that the proposed approach achieves better disambiguation than the current best method, and that it can also disambiguate same-name authors who have no co-authors.
Authors SHANG Yu-ling (尚玉玲); CAO Jian-jun (曹建军); LI Hong-mei (李红梅); ZHENG Qi-bin (郑奇斌) (College of Command Information Systems, PLA University of Science and Technology, Nanjing 210007, China; The 63rd Research Institute, National University of Defense Technology, Nanjing 210007, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 11, pp. 220-225, 260 (7 pages)
Funding Supported by the National Natural Science Foundation of China (61371196) and the China Postdoctoral Science Foundation (2015M582832)
Keywords Data quality; Entity resolution; Name disambiguation; Effective path; Connection strength
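
The pipeline summarized in the abstract (entity relationship graph, breadth-first search for paths between same-name authors, connection strength compared with a threshold) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption, since the abstract gives no formulas. An "effective path" is taken here to be a simple path of at most `max_len` edges. The weights in `EDGE_WEIGHT`, the scoring rule in `connection_strength`, the `threshold` value, and all node names are hypothetical and are not taken from the paper.

```python
from collections import deque

# Hypothetical edge-type weights; the abstract does not state any values.
EDGE_WEIGHT = {"coauthor": 1.0, "affiliation": 0.5}


def build_graph(edges):
    """Build an undirected adjacency map from (node_a, node_b, edge_type) triples."""
    graph = {}
    for a, b, kind in edges:
        graph.setdefault(a, []).append((b, kind))
        graph.setdefault(b, []).append((a, kind))
    return graph


def bounded_paths(graph, source, target, max_len):
    """Breadth-first enumeration of simple paths from source to target.

    Assumption: a path counts as "effective" if it repeats no node and has
    at most max_len edges. Each path is returned as its list of edge types.
    """
    found = []
    queue = deque([(source, (source,), ())])
    while queue:
        node, visited, kinds = queue.popleft()
        if len(kinds) == max_len:
            continue  # extending this partial path would exceed the bound
        for nxt, kind in graph.get(node, []):
            if nxt in visited:
                continue
            if nxt == target:
                found.append(list(kinds) + [kind])
            else:
                queue.append((nxt, visited + (nxt,), kinds + (kind,)))
    return found


def connection_strength(paths):
    """Assumed scoring rule: each path adds (product of edge weights) / length.

    More paths, shorter paths, and co-author edges all raise the score.
    """
    total = 0.0
    for kinds in paths:
        weight = 1.0
        for kind in kinds:
            weight *= EDGE_WEIGHT[kind]
        total += weight / len(kinds)
    return total


def same_person(graph, a, b, max_len=3, threshold=0.1):
    """Treat two same-name records as one person if the strength reaches the threshold."""
    return connection_strength(bounded_paths(graph, a, b, max_len)) >= threshold


# Toy data: four records that all carry the name "J. Wang".
graph = build_graph([
    ("J. Wang#1", "L. Chen", "coauthor"),
    ("J. Wang#2", "L. Chen", "coauthor"),
    ("J. Wang#1", "Inst A", "affiliation"),
    ("J. Wang#3", "Inst A", "affiliation"),
    ("J. Wang#4", "Inst B", "affiliation"),
])
strength_12 = connection_strength(bounded_paths(graph, "J. Wang#1", "J. Wang#2", 3))
strength_13 = connection_strength(bounded_paths(graph, "J. Wang#1", "J. Wang#3", 3))
strength_14 = connection_strength(bounded_paths(graph, "J. Wang#1", "J. Wang#4", 3))
```

In this toy graph, records #1 and #2 are linked through a shared co-author, #1 and #3 only through a shared institution, and #4 is unconnected to #1. The affiliation edges are what let a record with no co-authors still be linked, which is the case the abstract highlights; how strongly such a link should count is a tuning choice in this sketch.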
