
Recapturing the Flow of Knowledge: Tracing Structured Knowledge Back to Historical Records
Abstract Tracing structured historical knowledge back to the original text of historical records improves the verifiability and reliability of that knowledge. To address the lack of a well-developed knowledge tracing mechanism in existing knowledge bases of ancient books and the absence of trigger words in some Archaic Chinese texts, this study proposes a method for tracing structured historical knowledge back to historical records. First, a structured historical knowledge tracing framework is proposed that combines techniques such as coreference resolution and textual entailment. Second, on a dataset constructed for this purpose, experiments compare how different pre-trained models, namely BERT (bidirectional encoder representations from transformers), SikuBERT, GPT-3 (generative pre-trained transformer 3), and GPT-4, and different input strategies affect tracing performance; the resulting structured historical knowledge tracing model, SHK-Tracer, achieves a precision of 80.19%. Finally, SHK-Tracer is used to trace the Shiji Multi-dimensional Knowledge Base (SMKB) back to several different historical books. The knowledge overlap between Shiji and the individual fragments of Zuozhuan and Guoyu turns out not to be proportional to the amount of information those fragments contain. The results help readers verify the authenticity of knowledge, provide cross-references between historical sources, and, combined with information such as the dating of the sources, help identify where a piece of knowledge originated. They also supply base corpora for digital humanities research such as knowmetrics of historical records, relation extraction, and linguistic style computation.
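As a rough illustration of the idea summarized in the abstract, the sketch below treats tracing as an entailment-style matching problem: a triple is verbalized as a hypothesis, each candidate passage acts as a premise, and the best-supported passage is returned. It is not the SHK-Tracer implementation. The `Triple` class, the `overlap_score` stand-in scorer, the 0.8 threshold, and the English toy passages are all assumptions made for illustration; a real system would replace the scorer with a trained entailment model and would work on the original Archaic Chinese text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object (SPO) triple; hypothetical container."""
    subject: str
    predicate: str
    obj: str

    def as_hypothesis(self) -> str:
        # Verbalize the triple as a plain statement (one possible input strategy).
        return f"{self.subject} {self.predicate} {self.obj}"


def overlap_score(premise: str, hypothesis: str) -> float:
    """Stand-in scorer: share of hypothesis tokens that occur in the premise.

    This is NOT an entailment model; it only marks where one would be called.
    """
    hyp_tokens = hypothesis.split()
    if not hyp_tokens:
        return 0.0
    premise_tokens = set(premise.split())
    return sum(tok in premise_tokens for tok in hyp_tokens) / len(hyp_tokens)


def trace(
    triple: Triple,
    passages: List[str],
    scorer: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
) -> Optional[int]:
    """Return the index of the best-supported passage, or None if no passage
    reaches the threshold."""
    if not passages:
        return None
    hypothesis = triple.as_hypothesis()
    scored = [(scorer(p, hypothesis), -i) for i, p in enumerate(passages)]
    best_score, neg_idx = max(scored)  # ties resolve to the earliest passage
    return -neg_idx if best_score >= threshold else None


def precision(predictions: List[Optional[int]], gold: List[Optional[int]]) -> float:
    """Precision = correct traced links / all traced links (None = no link)."""
    made = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    if not made:
        return 0.0
    return sum(p == g for p, g in made) / len(made)


# Toy passages: English paraphrases for illustration, not quotations.
passages = [
    "Xiang Ji was a native of Xiaxiang",
    "Liu Bang was a native of Pei",
]
triples = [
    Triple("Xiang Ji", "native of", "Xiaxiang"),
    Triple("Liu Bang", "native of", "Pei"),
    Triple("Han Xin", "native of", "Huaiyin"),
]
links = [trace(t, passages) for t in triples]
```

Here the first two triples are linked to the passage that supports them, and the third is left unlinked because no passage reaches the threshold. Returning `None` rather than forcing a match keeps unsupported triples out of the precision denominator, which matches the usual definition of precision over predicted links.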
Authors Zhang Qi; Kong Jia; Hu Haotian; Wang Dongbo; Wang Hao; Deng Sanhong (School of Information Management, Nanjing University, Nanjing 210023; Key Laboratory of Data Engineering and Knowledge Services in Provincial Universities (Nanjing University), Nanjing 210023; College of Information Management, Nanjing Agricultural University, Nanjing 210095; Research Center for Humanities and Social Computing, Nanjing Agricultural University, Nanjing 210095)
Source Journal of the China Society for Scientific and Technical Information (《情报学报》; indexed in CSSCI, CSCD, and the Peking University Core Journals list), 2024, No. 4, pp. 405-415 (11 pages)
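Funding Major Program of the National Social Science Fund of China, "Construction and Application of a Cross-Language Knowledge Base of Ancient Chinese Classics" (21&ZD331).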
Keywords knowledge service; knowledge provenance; knowmetrics; digital humanities; SPO triples