Funding: Supported by the National Natural Science Foundation of China (Grant No. 42050101), the National Key Research and Development Program of China (Grant Nos. 2022YFB3904200 & 2021YFB00903), and the International Big Science Program of Deep-time Digital Earth (DDE).
Abstract: A geoscience knowledge graph (GKG) can organize various kinds of geoscience knowledge into a machine-understandable and computable semantic network, and is an effective way to organize geoscience knowledge and provide knowledge-related services. As a result, it has gained significant attention and become a frontier in geoscience. Geoscience knowledge is derived from many disciplines and has complex spatiotemporal features and relationships at multiple scales, granularities, and dimensions. Therefore, establishing a GKG representation model that conforms to the characteristics of geoscience knowledge is the basis and premise for the construction and application of GKGs. However, existing knowledge graph representation models rely on fixed tuples, which cannot fully represent complex spatiotemporal features and relationships. To address this issue, this paper first systematically analyzes the categorization of geoscience knowledge and its spatiotemporal features and relationships. On this basis, an adaptive representation model for GKGs is proposed that takes these complex spatiotemporal features and relationships into account. Under the constraint of a unified spatiotemporal ontology, the model adopts different tuples to adaptively represent different types of geoscience knowledge according to their spatiotemporal correlation. The model can efficiently represent geoscience knowledge, thereby avoiding the isolation of the spatiotemporal feature representation and improving the accuracy and efficiency of geoscience knowledge retrieval. It can further enable the alignment, transformation, computation, and reasoning of spatiotemporal information through the spatiotemporal ontology.
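The idea of choosing a tuple form according to spatiotemporal correlation can be illustrated with a minimal Python sketch. The abstract does not specify the paper's actual tuple definitions, so the two tuple shapes, the field names, the `represent` helper, and the sample facts below are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Triple:
    """Knowledge with no spatiotemporal dependence: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class SpatiotemporalTuple:
    """Knowledge tied to time and/or space; an unused slot stays None."""
    subject: str
    predicate: str
    obj: str
    time: Optional[str] = None   # hypothetical term from a shared time ontology
    space: Optional[str] = None  # hypothetical term from a shared space ontology


def represent(subject: str, predicate: str, obj: str,
              time: Optional[str] = None,
              space: Optional[str] = None) -> Union[Triple, SpatiotemporalTuple]:
    """Pick the tuple form based on whether the fact carries time or space."""
    if time is None and space is None:
        return Triple(subject, predicate, obj)
    return SpatiotemporalTuple(subject, predicate, obj, time, space)


# Illustrative facts only; the predicates are made up for this sketch.
facts = [
    represent("granite", "isA", "igneous rock"),
    represent("Tyrannosaurus", "occursIn", "fossil record",
              time="Late Cretaceous", space="North America"),
]

for fact in facts:
    print(type(fact).__name__, fact)
```

In this sketch the time and space slots hold plain strings; in a model constrained by a unified spatiotemporal ontology they would presumably refer to ontology terms, which is what would make alignment and reasoning across facts possible.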
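Abstract: Knowledge graphs, currently the most effective means of organizing and serving knowledge, have become a cornerstone of artificial intelligence and are widely used in semantic search, machine translation, information recommendation, and other areas. In the era of big data, the integration, mining, and analysis of dispersed, multi-source, heterogeneous data in Earth science (hereafter, geoscience), together with the intelligent discovery of knowledge from those data, urgently need the support of knowledge graphs. To promote the construction and application of geoscience knowledge graphs, the Deep-time Digital Earth (DDE) international big science program has treated knowledge graphs as a major component of its research and construction work since its launch in 2019. After more than three years of construction, DDE has built a large number of geoscience knowledge graphs, and a one-stop way of sharing them is urgently needed. This article first introduces the content system of the DDE knowledge graphs and analyzes their composition and characteristics. On this basis, it designs a one-stop sharing service system for geoscience knowledge graphs, covering both the functional system and the architecture. Finally, it describes the technical route for implementing the system and its key technologies. Practice shows that the system can effectively deliver one-stop sharing of DDE knowledge graphs and can serve as a reference for similar knowledge-sharing service systems.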
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 41421001, 42050101, and 42050105).
Abstract: Since the beginning of the 21st century, geoscience research has been entering a significant transitional period, with the establishment of a new knowledge system at its core and big data as its driving force. The move from the traditional encyclopedic disciplinary knowledge system to a computer-understandable and operable knowledge graph is a revolutionary leap in research on geoscience knowledge discovery. Building on the graph pattern of general knowledge representation, the geoscience knowledge graph extends geoscience knowledge with its unique spatiotemporal features and integrates geoscience knowledge elements, such as maps, text, and numbers, to establish an all-domain geoscience knowledge representation model. A federated, crowd intelligence-based collaborative method of constructing the geoscience knowledge graph is developed here, which enables the construction of a high-quality professional knowledge graph in collaboration with geoscientists worldwide. We also develop a method for constructing a dynamic knowledge graph of multi-modal geoscience data based on in-depth text analysis, which extracts geoscience knowledge from massive geoscience literature to construct an up-to-date and comprehensive dynamic geoscience knowledge graph. A comprehensive and systematic geoscience knowledge graph can not only deepen existing geoscience big data analysis, but also advance the construction of a high-precision geological time scale driven by big data, the compilation of intelligent maps driven by rules and data, and the analysis of geoscience knowledge evolution and reasoning, among others. It will further open new directions of geoscience research driven by both data and knowledge, break new ground where geoscience, information science, and data science converge, foster original innovation in geoscience research, and achieve major theoretical breakthroughs in spatiotemporal big data research.
Funding: This work was supported by the National Science Foundation [1440202].
Abstract: GeoLink has leveraged linked data principles to create a dataset that allows users to seamlessly query and reason over some of the most prominent geoscience metadata repositories in the United States. The GeoLink dataset includes such diverse information as port calls made by oceanographic cruises, physical sample metadata, research project funding and staffing, and authorship of technical reports. The data has been published according to best practices for linked data and is publicly available via a SPARQL Protocol and RDF Query Language (SPARQL) endpoint that at present contains more than 45 million Resource Description Framework (RDF) triples, together with a collection of ontologies and geo-visualization tools. This article describes the geoscience datasets, the modeling and publication process, and current uses of the dataset. The focus is on providing enough detail to enable researchers, application developers, and others who wish to leverage the GeoLink data in their own work to do so.
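To give a feel for querying triple-structured data of this kind, the following minimal Python sketch matches simple triple patterns against a small in-memory list. It is not the GeoLink endpoint or its vocabulary: the `ex:` identifiers, the predicates, and the `match` helper are hypothetical, and a real SPARQL query supports far more than single-pattern matching.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Made-up triples for illustration; not actual GeoLink data or vocabulary.
TRIPLES: List[Triple] = [
    ("ex:cruise1", "ex:hasPortCall", "ex:portA"),
    ("ex:cruise1", "ex:fundedBy", "ex:award1"),
    ("ex:cruise2", "ex:hasPortCall", "ex:portB"),
    ("ex:report1", "ex:hasAuthor", "ex:person1"),
]


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for subj, pred, obj in triples:
        if ((s is None or subj == s)
                and (p is None or pred == p)
                and (o is None or obj == o)):
            yield (subj, pred, obj)


# Which subjects have a port call? Roughly: SELECT ?s WHERE { ?s ex:hasPortCall ?o }
cruises = [subj for subj, _, _ in match(TRIPLES, p="ex:hasPortCall")]
print(cruises)
```

Each keyword argument left as `None` plays the role of a SPARQL variable, while a fixed argument plays the role of a bound term.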