Abstract
【Purpose/significance】Traditional text representation methods are often unsatisfactory because they ignore the semantic relationships between words. Constructing a text semantic graph based on knowledge graphs can address this semantic deficiency.【Method/process】First, a semantic graph is constructed from the encyclopedic knowledge graph CN-DBpedia, with the named entities in the text as nodes and the semantic relationships between entities as edges. Second, the concept graph CN-Probase is introduced to map entities to concepts, producing an enhanced Chinese text semantic graph that incorporates concept-level knowledge. Finally, the proposed method is validated using pattern discovery in news text as an example task.【Result/conclusion】This study proposes a novel method for constructing Chinese text semantic graphs based on multiple knowledge graphs.【Innovation/limitation】The method achieves semantic representation of Chinese text at both the entity level and the concept level, and can be applied to natural language processing tasks such as text classification and text analysis. Its limitation is that it has been validated only on news texts.
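The two-layer construction described in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' implementation and does not call CN-DBpedia or CN-Probase: the triples, the concept table, and the function names `build_entity_graph` and `add_concept_layer` are hypothetical stand-ins, and the entity list is assumed to come from a separate named-entity recognition step.

```python
# Hypothetical stand-in for (subject, relation, object) triples that an
# encyclopedic knowledge graph might return; invented for illustration.
TOY_TRIPLES = [
    ("Wuhan University", "located_in", "Wuhan"),
    ("Wuhan", "capital_of", "Hubei"),
]

# Hypothetical stand-in for an entity -> concepts (isA) lookup that a
# concept graph might provide; invented for illustration.
TOY_CONCEPTS = {
    "Wuhan University": ["university"],
    "Wuhan": ["city"],
    "Hubei": ["province"],
}


def build_entity_graph(entities, triples):
    """Entity layer: keep triples whose subject and object both occur in the text."""
    present = set(entities)
    return [(s, r, o) for (s, r, o) in triples if s in present and o in present]


def add_concept_layer(entities, entity_edges, concepts):
    """Concept layer: append an isA edge from each entity to each of its concepts."""
    edges = list(entity_edges)
    for entity in entities:
        for concept in concepts.get(entity, []):
            edges.append((entity, "isA", concept))
    return edges


# Assumed output of an NER step over one sentence of text.
entities = ["Wuhan University", "Wuhan"]

entity_edges = build_entity_graph(entities, TOY_TRIPLES)
enhanced_edges = add_concept_layer(entities, entity_edges, TOY_CONCEPTS)
```

Here `entity_edges` holds only the relation between the two entities found in the text, while `enhanced_edges` additionally links each entity to its concept, which is what allows texts to be compared at the concept level as well as the entity level.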
Authors
ZHAO Yi-ming (赵一鸣)
WU Lin-rong (吴林容)
REN Xiao-xiao (任笑笑)
Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China; School of Information Management, Wuhan University, Wuhan 430072, China; National Demonstration Center for Experimental Library and Information Science Education, Wuhan University, Wuhan 430072, China
Source
Information Science (《情报科学》)
CSSCI
Peking University Core Journal
2021, No. 4, pp. 23-29 (7 pages)
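Funding
Humanities and Social Sciences Research Project of the Ministry of Education, "Research on the Mechanism of Word Co-occurrence Based on Complex Semantic Relations" (18YJC870026)
National Natural Science Foundation of China, General Program, "Research on Path Identification and Evaluation in the Exploratory Search Process" (71874130)
National Natural Science Foundation of China, Innovative Research Group Project, "Information Resource Management" (71921002)
Young Elite Scientists Sponsorship Program of the China Association for Science and Technology (2017QNRC001).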
Keywords
Chinese text semantic graph
text representation
text analysis
knowledge graph
semantic information