
学术文本中细粒度知识实体的关联分析 (Cited by: 16)

Association Analysis of Fine-Grained Knowledge Entities in Academic Texts
Abstract: Examining how fine-grained knowledge entities are used in domain-specific texts is important for evaluating and selecting those entities. Fine-grained knowledge entities in academic texts usually fall into multiple types and are linked by multiple kinds of association. Mining the homogeneous and heterogeneous associations among them helps clarify how knowledge entities are actually used in a given field. Most existing research, however, focuses on extracting and evaluating single knowledge entities in academic texts and pays little attention to the relationships between entities, which to some extent limits knowledge discovery based on entity extraction. Taking natural language processing (NLP) as an example, this paper mines association data on fine-grained knowledge entities from the full text of academic papers and uses visualization to reveal the information those data contain. Specifically, Chinese papers included in the National Conference on Computational Linguistics (CCL) from 2009 to 2018 are selected as the original corpus, and the knowledge entities used in the papers are manually annotated. In line with the characteristics of NLP, the entities are divided into four types: "indicator entity", "tool entity", "resource entity", and "method entity". A knowledge entity association network is then built by combining the association rule mining algorithm Apriori with complex network analysis software, revealing the knowledge entities commonly used in the field and the correlations in their use.
Authors: ZHANG Chengzhi (章成志); XIE Yuxin (谢雨欣); SONG Yuntian (宋云天)
Source: Library Tribune (《图书馆论坛》), CSSCI, PKU Core; 2021, No. 3, pp. 12-20 (9 pages)
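Funding: Research output of the National Natural Science Foundation of China project "Fine-Grained Algorithm Entity Extraction and Evaluation Based on the Full-Text Content of Academic Literature" (Grant No. 72074113); the Open Fund of the Key Laboratory of Rich-Media Digital Publishing Content Organization and Knowledge Service project "Extraction, Association, and Evolution Analysis of Fine-Grained Knowledge Entities in Rich-Media Digital Publishing Content" (Grant No. ZD2020/09-04); and the Postgraduate Research & Practice Innovation Program of Jiangsu Province project "Extraction, Association, and Evolution Analysis of Fine-Grained Knowledge Entities in Academic Texts" (Grant No. KYCX20_0406).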
Keywords: full-text content analysis; fine-grained knowledge entity; association analysis
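
The abstract names Apriori as the association rule mining step but gives no implementation details. The following is a minimal sketch of that step under stated assumptions: each paper is treated as one transaction, namely the set of knowledge entities annotated in it. The entity names, the five toy "papers", and the thresholds `min_support=0.4` and `min_confidence=0.8` are all invented for illustration and are not taken from the article.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single entity seen in any paper.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates that occur in enough transactions.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) tuples meeting min_confidence."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Subsets of a frequent itemset are always in `frequent`.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical toy data: entities annotated in five imaginary papers.
# "SVM"/"CRF" stand in for method entities, "LIBSVM" for a tool entity,
# "F1"/"precision" for indicator entities, "corpus_A" for a resource entity.
papers = [
    {"SVM", "F1", "LIBSVM"},
    {"SVM", "F1", "precision"},
    {"CRF", "F1", "corpus_A"},
    {"SVM", "LIBSVM", "F1"},
    {"CRF", "precision"},
]

frequent = apriori(papers, min_support=0.4)
found_rules = association_rules(frequent, min_confidence=0.8)

for lhs, rhs, support, confidence in found_rules:
    print(sorted(lhs), "->", sorted(rhs), round(support, 2), round(confidence, 2))
```

Each resulting rule could serve as a weighted edge between entities when building an association network. Which rules and weights the article actually fed into its network analysis software is not stated in this record.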