Abstract
[Purpose/significance] Keywords are words or terms that reveal the topic and core content of an academic text. Identifying their functions provides underlying index support for fast and accurate retrieval of knowledge and literature. [Method/process] Existing studies of keyword context modeling are mostly limited to symbol-level semantic representations of the text. To address this, and drawing on an in-depth analysis of the writing conventions of academic literature, this paper proposes a lexical function recognition model based on multi-feature fusion. The model uses BERT to capture the context-dependent features of keywords, fuses them with the position of each keyword in the keyword list and in the full text as well as prior knowledge of lexical functions, and then applies an attention mechanism and a feed-forward neural network to classify the semantic function of each keyword as problem or method. [Result/conclusion] The experimental results show that both position information and prior knowledge effectively improve keyword semantic function recognition, with prior knowledge making a comparatively large contribution to the improvement.
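The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the model's dimensions, parameters, or exact attention formulation, so everything below is an assumption made for illustration: the three feature vectors are hand-written stand-ins for the BERT context embedding, the position features, and the prior-knowledge features, and `query`, `W`, and `b` are hypothetical fixed values where a real model would use learned parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fuse_with_attention(features, query):
    """Attention-weighted sum of equally sized feature vectors."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(f, query) / scale for f in features])
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights

def feed_forward(x, W, b):
    """Single linear layer followed by softmax over the function labels."""
    logits = [dot(row, x) + bias for row, bias in zip(W, b)]
    return softmax(logits)

# Hypothetical 4-dimensional feature vectors for one keyword (not from the paper).
context_vec = [0.2, -0.1, 0.5, 0.3]   # stand-in for a BERT context embedding
position_vec = [0.9, 0.1, 0.0, 0.4]   # stand-in for keyword-list / full-text position
prior_vec = [0.1, 0.8, -0.2, 0.6]     # stand-in for prior knowledge of word function

query = [0.5, 0.5, 0.5, 0.5]          # assumed fixed; a real model would learn this
W = [[0.3, -0.2, 0.1, 0.4],           # assumed weights for the label "problem"
     [-0.1, 0.5, 0.2, -0.3]]          # assumed weights for the label "method"
b = [0.0, 0.0]

fused, attn = fuse_with_attention([context_vec, position_vec, prior_vec], query)
probs = feed_forward(fused, W, b)
labels = ["problem", "method"]
prediction = labels[probs.index(max(probs))]
print(attn, probs, prediction)
```

The attention weights in `attn` indicate how much each of the three feature sources contributes to the fused representation, which is one simple way to inspect the relative influence of position information and prior knowledge.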
Authors
张国标
李鹏程
陆伟
程齐凯
Zhang Guobiao; Li Pengcheng; Lu Wei; Cheng Qikai (School of Information Management, Wuhan University, Wuhan 430072; Institute for Information Retrieval and Knowledge Mining, Wuhan University, Wuhan 430072)
Source
《图书情报工作》
CSSCI
Peking University Core Journal
2021, Issue 9, pp. 89-96 (8 pages)
Library and Information Service
Funding
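This paper is one of the research outcomes of the National Natural Science Foundation of China project "Research on Citation Recommendation for Academic Literature Based on Multi-Semantic Information Fusion" (Grant No. 71673211)
and the National Natural Science Foundation of China Youth Project "Research on Diversified Citation Recommendation Based on Deep Semantic Mining" (Grant No. 71704137).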