

Knowledge Graph Representation Method Combined with Semantic Parsing
Abstract: Most knowledge graph representation learning models use only the information contained in triples. To address this limitation, this paper proposes BERT-PKE (bidirectional encoder representations from transformers - pruning knowledge embedding), a representation model that incorporates semantic parsing. The model draws on the textual descriptions of entities and relations, parsing them with BERT's bidirectional encoder representations to mine deep semantic information. Because BERT is expensive to train, a pruning strategy based on word frequency and k-nearest neighbors is proposed to distill a refined set of text descriptions. In addition, because the construction of negative samples affects model training, two strategies are introduced to improve random sampling. The first is a negative sampling method based on entity distribution, which selects the entity to be replaced according to a Bernoulli distribution probability; this reduces the pseudo-labeling problem caused by negative sampling. The second is a negative sampling method based on entity similarity: entities are first embedded into a vector space with TransE and then grouped with the k-means clustering algorithm. Replacing entities within the same cluster yields high-quality negative triples, which benefits entity feature learning. Experimental results show that the proposed BERT-PKE model significantly outperforms TransE, KG-BERT, RotatE, and other state-of-the-art baselines.
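The abstract describes the entity-distribution strategy only at a high level. The sketch below fills it in with the standard Bernoulli trick from TransH, which the wording suggests: for each relation, corrupt the head with probability tph/(tph+hpt), where tph is the average number of tails per head and hpt the average number of heads per tail. The function names and data layout here are illustrative assumptions, not the paper's actual code.

```python
import random
from collections import defaultdict

def bernoulli_probs(triples):
    """Per-relation probability of corrupting the head: tph / (tph + hpt)."""
    t_per_h = defaultdict(set)   # (r, h) -> set of tails
    h_per_t = defaultdict(set)   # (r, t) -> set of heads
    for h, r, t in triples:
        t_per_h[(r, h)].add(t)
        h_per_t[(r, t)].add(h)
    tph_sum, tph_cnt = defaultdict(int), defaultdict(int)
    hpt_sum, hpt_cnt = defaultdict(int), defaultdict(int)
    for (r, _), tails in t_per_h.items():
        tph_sum[r] += len(tails)
        tph_cnt[r] += 1
    for (r, _), heads in h_per_t.items():
        hpt_sum[r] += len(heads)
        hpt_cnt[r] += 1
    return {r: (tph_sum[r] / tph_cnt[r]) /
               (tph_sum[r] / tph_cnt[r] + hpt_sum[r] / hpt_cnt[r])
            for r in tph_sum}

def corrupt(triple, entities, p_head):
    """Replace the head with probability p_head[r], otherwise the tail."""
    h, r, t = triple
    if random.random() < p_head[r]:
        return (random.choice(entities), r, t)
    return (h, r, random.choice(entities))
```

The effect is that one-to-many relations mostly corrupt the head and many-to-one relations mostly corrupt the tail, which lowers the chance of accidentally generating a true triple as a negative sample, i.e. the pseudo-labeling problem the abstract mentions.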
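For the similarity-based strategy, the abstract specifies the pipeline (TransE embeddings, k-means clustering, replacement within a cluster) but no implementation. A minimal sketch follows, assuming `entity_emb` is a hypothetical dict mapping entity names to pretrained TransE vectors.

```python
import random
from collections import defaultdict

import numpy as np
from sklearn.cluster import KMeans

def build_cluster_index(entity_emb, n_clusters=100):
    """Cluster pretrained TransE entity vectors with k-means and map each
    entity name to the full member list of its cluster."""
    names = list(entity_emb)
    vectors = np.stack([entity_emb[name] for name in names])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(vectors)
    clusters = defaultdict(list)
    for name, label in zip(names, labels):
        clusters[label].append(name)
    return {name: clusters[label] for name, label in zip(names, labels)}

def corrupt_within_cluster(triple, cluster_of):
    """Swap the head or tail for another entity from the same k-means
    cluster; fall back to the unchanged triple for singleton clusters."""
    h, r, t = triple
    target = random.choice(["head", "tail"])
    e = h if target == "head" else t
    candidates = [c for c in cluster_of[e] if c != e]
    if not candidates:
        return (h, r, t)
    new_e = random.choice(candidates)
    return (new_e, r, t) if target == "head" else (h, r, new_e)
```

Because entities in one cluster are close in embedding space, swapping them produces negatives that are hard to distinguish from the positive triple, which is what makes them informative for entity feature learning.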
Authors: Hu Xuyang; Wang Zhizheng; Sun Yuanyuan; Xu Bo; Lin Hongfei (School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024)
Source: Journal of Computer Research and Development (EI, CSCD, Peking University Core), 2022, No. 12, pp. 2878-2888 (11 pages)
Funding: National Key Research and Development Program of China (2018YFC0830603)
Keywords: knowledge graph representation learning; BERT; semantic parsing; negative sampling; pruning



相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部