
Knowledge Graph Representation and Fusion Based on Oil and Gas Exploration Small Datasets
Abstract As big data is applied more widely in oil and gas exploration, many domain experts have begun building knowledge graphs to integrate relevant knowledge, yet many "knowledge islands" remain and exploration personnel still cannot share this knowledge. To address the complexity and particularity of oil and gas exploration data, and the scarcity of data in the early stage of graph construction, this paper proposes a knowledge representation and fusion method for small datasets in oil and gas exploration. To handle data heterogeneity in knowledge fusion, the method adds a knowledge representation module that combines a database with the Web Ontology Language (OWL) to represent the graph, so that data with different structures can be combined in a unified way while preserving the reliability of the data content. Knowledge fusion is divided into ontology fusion and entity fusion. Many fusion techniques exist, including embedding representations based on model training and a range of unsupervised learning methods. The method targets the early stage of graph construction in the oil and gas exploration domain, where data collection is difficult, training data is very limited, and models and clustering algorithms are hard to apply; it therefore draws on several fusion techniques and studies both ontology fusion and entity fusion. Ontology fusion uses unsupervised learning with a text matching algorithm to compute the similarity of graph node features, then uses active learning, in the form of human-computer interaction, to correct the data and obtain the final result. Entity fusion follows the same approach and additionally uses the Inverse Document Frequency (IDF) coefficient, structural similarity, and related algorithms to improve the accuracy of the similarity computed between two entities. The goal is a knowledge fusion framework for oil and gas exploration that is easy to optimize, places low demands on data, and performs relatively well.
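The abstract names the ingredients of the fusion step (text matching, IDF weighting, structural similarity, and human review through active learning) but does not give formulas. The following is a minimal Python sketch of how such a pipeline could be put together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the whitespace tokenizer, the smoothed IDF formula, the IDF-weighted Jaccard overlap used as the text score, the plain Jaccard overlap of neighbouring nodes used as the structural score, the weight `alpha`, and the `accept`/`reject` thresholds that decide which candidate pairs go to a human reviewer.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real tokenizer."""
    return text.lower().split()


def idf_weights(names):
    """Smoothed IDF per token: log((1 + N) / (1 + df)) + 1 (assumed formula)."""
    n = len(names)
    df = Counter(tok for name in names for tok in set(tokenize(name)))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def text_similarity(a, b, idf):
    """IDF-weighted Jaccard overlap between the token sets of two names."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    union = ta | tb
    if not union:
        return 0.0
    default = max(idf.values(), default=1.0)  # unseen tokens are treated as rare

    def weight(tokens):
        return sum(idf.get(t, default) for t in tokens)

    return weight(ta & tb) / weight(union)


def structural_similarity(neigh_a, neigh_b):
    """Plain Jaccard overlap between two sets of neighbouring node labels."""
    union = neigh_a | neigh_b
    return len(neigh_a & neigh_b) / len(union) if union else 0.0


def combined_score(a, b, neigh_a, neigh_b, idf, alpha=0.7):
    """Weighted mix of the text and structural scores; alpha is an assumption."""
    text = text_similarity(a, b, idf)
    struct = structural_similarity(neigh_a, neigh_b)
    return alpha * text + (1 - alpha) * struct


def triage(score, accept=0.8, reject=0.3):
    """Auto-decide confident pairs; route the uncertain band to a human reviewer."""
    if score >= accept:
        return "merge"
    if score <= reject:
        return "distinct"
    return "review"
```

With IDF weighting, a token that appears in most node names (for example a generic word such as "well") contributes less to the score than a rare, discriminative token. Only pairs whose score falls between the two thresholds would be shown to a reviewer, which keeps the manual effort small when labelled data is scarce.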
Authors TIAN Zhiyu (田芷瑜); WEI Qian (卫乾); ZHAO Shiliang (赵世亮); XU Ye (许野) (Kunlun Digital Intelligence Technology Co., Ltd., Beijing 102206, China)
Source Information & Computer (《信息与电脑》), 2023, No. 19, pp. 164-170 (7 pages)
Keywords knowledge representation; knowledge fusion; knowledge graph; oil and gas exploration; unsupervised learning