Abstract
[Purpose/Significance] Scientific terms carry the foundational knowledge and core concepts of a discipline. Mining and computing features of the scientific terms found in academic full-text citations involved in transdisciplinary knowledge diffusion is important for understanding how transdisciplinary knowledge systems intersect and merge, and what influence they exert. [Method/Process] Taking information science as an example and starting from an unlabeled transdisciplinary corpus, the study draws on an authoritative knowledge system of scientific terms and uses a character-level sequence labeling model with distant supervision to build a training corpus for extracting and classifying scientific terms in transdisciplinary knowledge diffusion. It then compares deep learning models to find the best-performing one, defines rules for identifying new scientific terms from a knowledge discovery perspective, and finally computes multidimensional features of the diffused terms, including discipline distribution, the section in which the citation appears, and concept specificity. [Result/Conclusion] The RoBERTa model performs best overall across the evaluation metrics, with an F-score (harmonic mean) of 98.08%, which indicates that the approach identifies scientific terms in transdisciplinary knowledge diffusion reliably and effectively. The recognition method based on distant supervision and deep learning helps mine knowledge about these terms and can provide domain-specific basic computing resources for intelligent knowledge mining of transdisciplinary knowledge diffusion. The multidimensional feature computation can effectively reveal the cross-fertilization patterns of transdisciplinary knowledge diffusion at the granularity of scientific terms.
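The abstract does not specify how the distant supervision step was implemented. As a rough illustration only, distant supervision for character-level sequence labeling is often sketched as greedy longest-match lookup of known terms against unlabeled text, which yields BIO tags without manual annotation. The function name `distant_label`, the sample lexicon, and the `TERM` category below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def distant_label(sentence: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Tag each character with a BIO label via greedy longest-match lookup.

    `lexicon` maps a known term to its category (hypothetical data, standing in
    for an authoritative term knowledge system).
    """
    labels = ["O"] * len(sentence)
    max_len = max((len(term) for term in lexicon), default=0)
    i = 0
    while i < len(sentence):
        matched = False
        # Try the longest candidate first so nested terms do not split a longer one.
        for span in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + span]
            if candidate in lexicon:
                category = lexicon[candidate]
                labels[i] = f"B-{category}"
                for j in range(i + 1, i + span):
                    labels[j] = f"I-{category}"
                i += span
                matched = True
                break
        if not matched:
            i += 1
    return list(zip(sentence, labels))


# Hypothetical lexicon: "知识" is nested inside the longer term "知识图谱".
LEXICON = {"知识图谱": "TERM", "知识": "TERM"}
tagged = distant_label("构建知识图谱", LEXICON)
for char, label in tagged:
    print(char, label)
```

Labels produced this way are noisy (unknown terms are tagged `O`, and ambiguous strings are always tagged as terms), which is why such a corpus is typically used to train a model rather than as a final result.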
Authors
孔玲
胡昊天
张卫
王东波
叶文豪
白如江
王效岳
Kong Ling; Hu Haotian; Zhang Wei; Wang Dongbo; Ye Wenhao; Bai Rujiang; Wang Xiaoyue (School of Information Management, Shandong University of Technology, Zibo 255049; School of Information Management, Nanjing University, Nanjing 210023; School of Information Management, Nanjing Agricultural University, Nanjing 210095)
Source
《图书情报工作》
Peking University Core Journal
2024, Issue 12, pp. 119-137 (19 pages)
Library and Information Service
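Funding
National Natural Science Foundation of China, Youth Program: "Research on Innovation Knowledge Graph Construction and Knowledge Reasoning for Science and Technology Project Evaluation" (Grant No. 72304142)
This work is one of the research outputs of the National Social Science Fund of China project "Research on Evaluating the Impact of Academic Papers Based on Text Content Mining" (Grant No. 19BTQ085).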
Keywords
interdisciplinarity
transdisciplinary knowledge diffusion
academic full-text citations
scientific terms
knowledge extraction