
Research on Knowledge Unit Representation and Extraction for Unstructured Abstracts of Chinese Scientific and Technical Literature: Based on Knowledge Unit Ontology Theory (Cited by: 17)
Abstract [Purpose/significance] Scientific and technical literature has grown explosively in recent years, yet a large number of unstructured abstracts remain in this mass of literature. Unstructured abstracts are harder for scholars to read and understand, and they hinder both automated knowledge extraction from the information inside the abstract and the corresponding retrieval. Studying a knowledge representation model for unstructured abstracts of scientific and technical literature, together with a method for extracting it automatically, is therefore important for fast reading by scholars and for automated machine processing. [Method/process] Building on an analysis of the structure of unstructured abstracts and on knowledge unit ontology theory, the paper constructs a knowledge unit ontology model for unstructured abstracts of scientific and technical literature. By analyzing how unstructured abstracts are written, it divides the text at the sentence level into three elements: purpose, method, and result or conclusion. It then tallies the clue words, sentence patterns, and positions found in each element sentence, builds a corresponding rule base, and constructs an extraction algorithm from the ontology model and the rule base. Finally, a sample of articles downloaded from Computer Technology and Development (《计算机技术与发展》) is used for the experiment. [Result/conclusion] By adding a sentence pattern set and a clue word set, the paper refines the elements of unstructured abstracts and constructs a knowledge unit ontology model for them. The experimental results show that the proposed model can effectively extract the knowledge units in unstructured abstracts. [Limitations] A shortcoming of the experiment is that the sentence patterns and clue words in the abstracts have to be summarized manually.
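The rule-based procedure the abstract describes (sentence-level segmentation, then labeling each sentence as purpose, method, or result/conclusion from clue words, sentence patterns, and position) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not publish the rule base, so the clue words, regular expressions, scoring weights, and the names `KnowledgeUnit`, `classify`, and `extract_units` are all hypothetical, and the example uses English text for readability.

```python
import re
from dataclasses import dataclass

LABELS = ("purpose", "method", "result")  # "result" stands for result or conclusion

# Hypothetical clue words; the paper's real rule base is not given in the abstract.
CLUE_WORDS = {
    "purpose": ["in order to", "aims to", "to solve", "to address"],
    "method": ["propose", "based on", "by using", "construct", "adopt"],
    "result": ["results show", "experiment", "demonstrate", "effectively", "conclude"],
}

# Hypothetical sentence patterns, one small list per element.
SENTENCE_PATTERNS = {
    "purpose": [re.compile(r"^(to|in order to)\b", re.I)],
    "method": [re.compile(r"\b(this paper|we)\s+(proposes?|constructs?|designs?)\b", re.I)],
    "result": [re.compile(r"\b(results?|experiments?)\s+(shows?|indicates?)\b", re.I)],
}


@dataclass
class KnowledgeUnit:
    element: str   # one of LABELS
    sentence: str  # the sentence text carrying the unit
    position: int  # 0-based index of the sentence within the abstract


def split_sentences(abstract: str) -> list[str]:
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]


def position_hint(index: int, total: int) -> str:
    """Assumed prior: early sentences state purpose, late ones state results."""
    rel = index / (total - 1) if total > 1 else 0.5
    if rel < 1 / 3:
        return "purpose"
    if rel > 2 / 3:
        return "result"
    return "method"


def classify(sentence: str, index: int, total: int) -> str:
    """Score each element by clue words (+1), patterns (+2), and position (+0.5)."""
    lowered = sentence.lower()
    scores = {label: 0.0 for label in LABELS}
    for label in LABELS:
        scores[label] += sum(1.0 for w in CLUE_WORDS[label] if w in lowered)
        scores[label] += sum(2.0 for p in SENTENCE_PATTERNS[label] if p.search(sentence))
    scores[position_hint(index, total)] += 0.5
    return max(LABELS, key=lambda label: scores[label])


def extract_units(abstract: str) -> list[KnowledgeUnit]:
    sentences = split_sentences(abstract)
    return [
        KnowledgeUnit(classify(s, i, len(sentences)), s, i)
        for i, s in enumerate(sentences)
    ]


if __name__ == "__main__":
    sample = (
        "To address the low accuracy of keyword retrieval, this study examines "
        "unstructured abstracts. We propose a rule-based extraction algorithm "
        "based on an ontology model. The results show that the algorithm "
        "extracts knowledge units effectively."
    )
    for unit in extract_units(sample):
        print(unit.position, unit.element, "|", unit.sentence)
```

The weights are arbitrary and only make a matched sentence pattern count for more than a single clue word; a rule base built from corpus statistics, as the abstract describes, would replace both the word lists and the weights. Chinese abstracts would also need a splitter that handles full-width punctuation.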
Source: Information Studies: Theory & Application (《情报理论与实践》), CSSCI, Peking University Core Journal, 2020, No. 2, pp. 157-163 (7 pages)
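Funding: An outcome of the National Natural Science Foundation of China project "Research on Resource Semantic Space and Its Retrieval in Knowledge Communities" (Grant No. 71573199)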
Keywords: scientific literature; unstructured abstract; knowledge representation; knowledge extraction; knowledge unit; ontology model