
Method of Entity Set Expansion Based on Frequent Pattern Under Meta Path

Cited by: 8
Abstract: Entity set expansion (ESE) takes a few seed entities belonging to a particular class and, following certain rules, retrieves more entities of that class. As a classic data mining task, ESE has many applications, such as dictionary construction and query suggestion. Existing ESE methods rely mainly on text or Web information; that is, the relations among entities are inferred from their co-occurrences in text or Web pages. With the rise of knowledge graph research, it has become possible to expand entity sets according to the co-occurrence of knowledge in a knowledge graph. This paper studies entity set expansion in knowledge graphs: given several seed entities, how to obtain more entities of the same class by leveraging the knowledge graph. First, the knowledge graph is modeled as a heterogeneous information network (HIN), that is, a network containing multiple types of entities or relations. A novel method of entity set expansion based on frequent patterns under meta paths, called FPMP_ESE, is then proposed. FPMP_ESE employs meta paths in the HIN to capture the latent common features of the seed entities. To find the important meta paths between seed entities, a new frequent-pattern-based algorithm for automatic meta path generation, called FPMPG, is designed. Then, to assign an appropriate weight to each meta path, a heuristic method and a PU learning method are designed. Finally, experiments on the real dataset Yago show that the proposed method achieves better performance and higher efficiency on the ESE task than other methods.
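The abstract does not spell out how FPMPG or the weighting schemes work, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a toy knowledge graph stored as triples, treats a meta path as a sequence of relation labels, keeps the meta paths that connect at least a chosen fraction of ordered seed pairs (a simple stand-in for frequent-pattern mining), and weights them in proportion to that support (a stand-in for the paper's heuristic and PU learning weighting). All entity names, relation names, function names, and thresholds are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Toy (head, relation, tail) triples; every entity and relation here is made up.
TRIPLES = [
    ("Alice", "actedIn", "Film1"),
    ("Bob", "actedIn", "Film1"),
    ("Bob", "actedIn", "Film2"),
    ("Carol", "actedIn", "Film2"),
    ("Eve", "actedIn", "Film2"),
    ("Alice", "bornIn", "CityX"),
    ("Dave", "bornIn", "CityX"),
]
SEEDS = ["Alice", "Bob", "Carol"]


def build_adj(triples):
    """Adjacency lists with an inverse edge (rel^-1) added for every triple."""
    adj = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
        adj[tail].append((rel + "^-1", head))
    return adj


def walks(adj, start, max_len):
    """Yield (meta_path, end_node) for each simple path of 1..max_len edges."""
    stack = [(start, (), (start,))]
    while stack:
        node, rels, visited = stack.pop()
        if rels:
            yield rels, node
        if len(rels) < max_len:
            for rel, nxt in adj[node]:
                if nxt not in visited:
                    stack.append((nxt, rels + (rel,), visited + (nxt,)))


def mine_meta_paths(adj, seeds, max_len=2, min_support=0.5):
    """Keep meta paths linking >= min_support of ordered seed pairs.

    Returns {meta_path: weight}, with weights proportional to support
    and normalized to sum to 1 (an assumed heuristic, not the paper's).
    """
    pairs = list(permutations(seeds, 2))
    support = Counter()
    for s, t in pairs:
        support.update({rels for rels, end in walks(adj, s, max_len) if end == t})
    frequent = {p: c / len(pairs) for p, c in support.items()
                if c / len(pairs) >= min_support}
    total = sum(frequent.values())
    return {p: v / total for p, v in frequent.items()}


def expand(adj, seeds, weights, max_len=2):
    """Score each non-seed entity by the weighted share of seeds reaching it."""
    seed_set = set(seeds)
    scores = defaultdict(float)
    for s in seeds:
        for rels, end in set(walks(adj, s, max_len)):
            if rels in weights and end not in seed_set:
                scores[end] += weights[rels] / len(seeds)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    adj = build_adj(TRIPLES)
    weights = mine_meta_paths(adj, SEEDS)
    print(weights)
    print(expand(adj, SEEDS, weights))
```

In this toy graph the only meta path frequent among the seeds is the co-acting path (actedIn followed by its inverse), so Eve is proposed as a new member while Dave, who merely shares a birthplace with one seed, is not.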
Authors: ZHENG Yu-Yan (郑玉艳); TIAN Ying (田莹); SHI Chuan (石川) (School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source: Journal of Software (《软件学报》); indexed in EI, CSCD, and the Peking University Core Journals list; 2018, No. 10, pp. 2915-2930 (16 pages)
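Funding: National Key Research and Development Program of China (973) (2017YFB0803304); National Natural Science Foundation of China (61772082, 61375058); Beijing Natural Science Foundation (4182043)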
Keywords: knowledge graph; entity set expansion; heterogeneous information network; meta path; frequent pattern; PU learning
