Abstract
[Purpose/significance] To improve the semantic enrichment of scientific and technical literature, this paper surveys and summarizes models, techniques, and methods for annotating discourse elements in scientific and technical literature in China and abroad, providing a reference for researchers working on text mining, knowledge extraction from scientific papers, and semantic analysis systems. [Method/process] Using academic web search and the search engines of relevant databases, the authors closely read and surveyed references and research reports on scientific paper annotation, discourse elements, knowledge extraction, sentence recognition, and automatic article classification, and summarized automatic discourse-element annotation models and the progress of related work. [Result/conclusion] The annotation of discourse elements in scientific and technical literature has very important practical application value. Building an annotation model requires full consideration of the construction rationale, the annotation domain, the annotation granularity, and the annotation techniques.
Authors
Yu Gaihong (于改红)
Zhang Zhixiong (张智雄)
Ma Na (马娜)
Affiliations: University of Chinese Academy of Sciences, Beijing 100049; National Science Library, Chinese Academy of Sciences, Beijing 100190; Wuhan Library, Chinese Academy of Sciences, Wuhan 430071
Source
Library and Information Service (《图书情报工作》)
CSSCI
PKU Core Journals (北大核心)
2018, Issue 15, pp. 132-144 (13 pages)
Funding
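One of the research outputs of the Chinese Academy of Sciences special project for literature and information capacity building, "Application demonstration of automatic semantic annotation and indexing of physics research papers based on arXiv data" (project No. 院1657)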
Keywords
scientific and technical literature
discourse elements
annotation model
automatic annotation