Abstract
This paper proposes a knowledge discovery framework for the biomedical domain based on Subject-Predication-Object (SPO) triples, and studies the framework's key technologies and implementation process. First, SPO triples are extracted from biomedical literature with the SemRep tool, based on the UMLS corpus. Second, the triples are cleaned and filtered according to the domain knowledge organization system, using a self-defined vocabulary and cleaning rules. Third, NetMiner is used to draw semantic network diagrams with Subjects and Objects, respectively, as center nodes and Predications as edges. Finally, the diagrams are combined with expert interpretation to achieve domain knowledge discovery. An empirical study is conducted on the field of induced pluripotent stem cells. The results show that SPO triples reveal the knowledge content of scientific literature at a fine granularity, and that SPO-based semantic networks intuitively support domain knowledge discovery. The framework is compatible, efficient, and easy to implement.
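The cleaning and network-building steps of the pipeline can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not call SemRep or NetMiner. The names `SPO`, `clean_triples`, and `center_on`, the stop-term and allowed-predication sets, and the placeholder triples are all assumptions made up for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class SPO(NamedTuple):
    """One Subject-Predication-Object triple (hypothetical container)."""
    subject: str
    predication: str
    obj: str


def clean_triples(triples, stop_terms, allowed_predications):
    """Normalize, filter, and de-duplicate triples.

    stop_terms and allowed_predications stand in for the custom
    vocabulary and cleaning rules; their contents are assumptions.
    """
    seen = set()
    kept = []
    for t in triples:
        norm = SPO(
            t.subject.strip().lower(),
            t.predication.strip().upper(),
            t.obj.strip().lower(),
        )
        # Drop triples whose subject or object is an overly generic term.
        if norm.subject in stop_terms or norm.obj in stop_terms:
            continue
        # Keep only the predications chosen for the analysis.
        if norm.predication not in allowed_predications:
            continue
        if norm in seen:
            continue
        seen.add(norm)
        kept.append(norm)
    return kept


def center_on(triples, role):
    """Group edges around each subject or each object as the center node."""
    net = defaultdict(list)
    for t in triples:
        if role == "subject":
            net[t.subject].append((t.predication, t.obj))
        elif role == "object":
            net[t.obj].append((t.predication, t.subject))
        else:
            raise ValueError("role must be 'subject' or 'object'")
    return dict(net)


# Placeholder triples, not data from the paper.
raw = [
    SPO("Cell A", "AFFECTS", "Process B"),
    SPO("cell a ", "affects", "process b"),    # duplicate after normalization
    SPO("Cell A", "PART_OF", "Tissue C"),
    SPO("Patients", "PROCESS_OF", "Cell A"),   # generic term, filtered out
    SPO("Gene D", "COEXISTS_WITH", "Cell A"),  # predication not allowed here
    SPO("Gene D", "AFFECTS", "Process B"),
]
cleaned = clean_triples(raw, {"patients"}, {"AFFECTS", "PART_OF"})
by_subject = center_on(cleaned, "subject")
by_object = center_on(cleaned, "object")
```

The two dictionaries correspond to the subject-centered and object-centered views described in the abstract. In practice they would be exported as an edge list for a network visualization tool rather than inspected directly.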
Source
Digital Library Forum (《数字图书馆论坛》)
CSSCI
2017, Issue 9, pp. 28-34 (7 pages)
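Funding
Supported by the fund of the ISTIC-EBSCO Joint Laboratory for Literature Big Data Discovery Services, Institute of Scientific and Technical Information of China, under the project "Application Demonstration Research on Semantic Knowledge Organization of Scientific and Technical Literature Based on SemRep and SKOS".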
Keywords
Knowledge Discovery
SPO
Knowledge Organization
Semantic Network