Abstract
[Purpose/Significance] Information retrieval systems in specific domains often suffer from low retrieval efficiency because users lack domain knowledge. Existing domain-specific knowledge organization tools, such as domain knowledge graphs, can effectively alleviate this problem, but how best to embed them in existing information retrieval systems remains unsolved. This paper therefore proposes a full-process solution for an exploratory search system based on a domain knowledge graph, covering joint extraction of entities and relations, construction of the domain knowledge graph, and implementation of the exploratory search system. [Method/Process] Taking academic literature in library and information science (LIS) as the research object, the paper explores in depth the formal definition of LIS entities and the relations among them, proposes a method for joint entity and relation extraction based on Paddle UIE, and builds a prototype retrieval system on an interactive LIS knowledge graph. [Result/Conclusion] A comparison of joint entity and relation extraction methods, including Paddle UIE, CasRel, SpERT and CORE, shows that Paddle UIE, a pre-trained large language model based on prompt learning, achieves better extraction performance, especially when few samples are available. In addition, the results of two further search experiment tasks show that, compared with general search engines, the proposed system significantly improves user satisfaction and effectively addresses low retrieval performance when users lack domain knowledge (for example, in cross-disciplinary scenarios). This indicates that the full-process solution can offer process guidance and suggestions to academic literature service providers developing user support tools.
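The pipeline summarized in the abstract (extracted triples, then a knowledge graph, then interactive exploration) can be illustrated with a minimal sketch. This is not the paper's implementation: the triples, the relation name `used_for`, and the helper names `build_graph` and `expand` are all hypothetical, and the extraction step itself (Paddle UIE in the paper) is replaced by a hand-written list.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples, standing in for the output of a
# joint entity-relation extractor; none of these come from the paper's data.
TRIPLES = [
    ("knowledge graph", "used_for", "exploratory search"),
    ("prompt learning", "used_for", "joint extraction"),
    ("joint extraction", "used_for", "knowledge graph"),
]


def build_graph(triples):
    """Index triples by entity so each node lists its outgoing and incoming edges."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail, "out"))
        graph[tail].append((relation, head, "in"))
    return graph


def expand(graph, entity):
    """Return the entities one hop away from `entity`, sorted for stable display."""
    # .get() avoids creating an empty entry for entities that are not in the graph.
    return sorted({neighbor for _, neighbor, _ in graph.get(entity, [])})


graph = build_graph(TRIPLES)
print(expand(graph, "knowledge graph"))
```

In an exploratory search interface, a one-hop expansion like `expand` is the kind of step that lets a user without domain knowledge move from a known term to related concepts; a real system would also attach the supporting documents to each edge.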
Authors
Wang Juan; Cao Shujin; Wang Zhihong; Peng Bitao (School of Information Science and Technology, Guangdong University of Foreign Studies, Guangdong 510420; Institute of Information Management, Shandong University of Technology, Zibo 255000; Zhongguancun Laboratory, Beijing 102206)
Source
Library and Information Service (《图书情报工作》)
CSSCI
Peking University Core Journal
2024, No. 3, pp. 105-116 (12 pages)
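Funding
One of the research outcomes of the National Social Science Fund of China General Project "Research on Deep-Learning-Based Discovery of Online Academic Intelligence in Subject Domains" (Grant No. 18BTQ065).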
Keywords
domain knowledge graph
prompt learning
large language model
joint extraction
exploratory search