Abstract
[Objective] This paper aims to automatically convert the STKOS Metathesaurus from relational database records to RDF data. [Methods] We first built a semantic description model for the STKOS Metathesaurus. Then, based on how its data are stored and on their characteristics, we wrote separate R2RML mapping documents that map scientific terms, standard concepts, categories, and source concepts and terms from relational database fields to RDF datasets, and ran the batch conversion automatically with the R2RML Parser tool. [Results] The large-scale published data of the STKOS Metathesaurus were converted to RDF, yielding 190 million RDF triples, which were loaded into a Virtuoso database that provides SPARQL querying. [Limitations] Custom predicates in R2RML are not flexible enough, so complex data structures must be split and transformed beforehand. [Conclusions] This R2RML-based conversion of the STKOS Metathesaurus offers a mapping approach that can serve as a reference for converting other relational databases or thesauri to RDF.
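To make the row-to-triple mapping idea in the abstract concrete, here is a minimal Python sketch of what an R2RML-style subject template does: column values from a relational row are percent-encoded, substituted into an IRI template, and emitted as N-Triples. This is not the R2RML Parser tool and not the authors' mapping documents. The column names (`concept_id`, `label`, `lang`), the `http://example.org/stkos/` namespace, and the choice of SKOS properties are all assumptions made for illustration.

```python
import re
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical subject template, in the spirit of rr:template.
CONCEPT_TEMPLATE = "http://example.org/stkos/concept/{concept_id}"


def expand_template(template, row):
    """Replace each {column} placeholder with the percent-encoded row value."""
    return re.sub(
        r"\{(\w+)\}",
        lambda m: quote(str(row[m.group(1)]), safe=""),
        template,
    )


def escape_literal(value):
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(row):
    """Turn one (hypothetical) concept row into two N-Triples statements."""
    subject = expand_template(CONCEPT_TEMPLATE, row)
    label = escape_literal(row["label"])
    return [
        f"<{subject}> <{RDF_TYPE}> <{SKOS}Concept> .",
        f'<{subject}> <{SKOS}prefLabel> "{label}"@{row["lang"]} .',
    ]


# Stand-in for rows read from a relational table (assumed schema).
rows = [
    {"concept_id": "C0001", "label": "relational database", "lang": "en"},
    {"concept_id": "C 0002", "label": 'so-called "linked data"', "lang": "en"},
]
triples = [t for row in rows for t in row_to_ntriples(row)]
```

In an actual R2RML workflow the template and predicate choices would live in a declarative mapping document rather than in code; the sketch only shows the substitution step that such a mapping describes.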
Authors
Wang Ying (王颖)
Wu Sizhu (吴思竹)
Wang Ying; Wu Sizhu (National Science Library, Chinese Academy of Sciences, Beijing 100190, China; Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
Peking University Core Journals (北大核心)
2018, No. 12, pp. 89-97 (9 pages in total)
Data Analysis and Knowledge Discovery
Funding
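National Science and Technology Library (NSTL), "Next-Generation Open Knowledge Service System for National Science and Technology Innovation" pre-research task "Research on Key Technologies for STKOS Linked Data Publishing and Open Sharing" (Grant No. XQYF0104)
National Social Science Fund of China, Youth Project "Research on and Implementation of R2RML-Based RDB-to-RDF Conversion Patterns" (Grant No. 13CTQ009); this paper is one of the research outputs of these projects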