Research on named entity recognition method oriented to hydrological model knowledge learning (水文模型知识学习的命名实体识别方法研究)
Abstract To study text-based automatic knowledge extraction for building knowledge graphs in the water resources domain, this paper takes hydrological model knowledge as an example, covering model names, simulated elements, application basins, calculation periods, accuracy, inheritance-development relationships, developers, and developing institutions. Using 883 Chinese journal articles on hydrological models as the data source, it builds a multi-strategy named entity recognition method for hydrological models that combines the BERT-Base-Chinese model, the LAC (Lexical Analysis of Chinese) tool, and pattern recognition. The articles were manually annotated with the five-tag sequence labelling scheme (BMOES) to build the input dataset for knowledge extraction, which was used both to train the BERT model and to evaluate the multi-strategy method. The results show that, for all eight categories of hydrological model named entities, the multi-strategy method reaches an F1 score (the harmonic mean of precision and recall) above 90%, and that choosing a different recognition method for each entity category improves performance over using the BERT model alone. The proposed method can serve as a reference for knowledge extraction in other water resources scenarios and supports the construction of domain knowledge graphs.
Authors ZHAO Huizi (赵慧子); ZHOU Yifan (周逸凡); DUAN Hao (段浩); ZHAO Hongli (赵红莉); ZHANG Dong (张东) (China Institute of Water Resources and Hydropower Research, Beijing 100038, China; Key Laboratory of River Basin Digital Twinning, Ministry of Water Resources, Beijing 100038, China; Dalian Maritime University, Dalian 116026, China)
Source Journal of China Institute of Water Resources and Hydropower Research (《中国水利水电科学研究院学报(中英文)》), PKU Core journal, 2023, No. 6, pp. 574-585 (12 pages)
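Funding Science and Technology Innovation 2030 Major Project (2021ZD0113602); China Knowledge Centre for Engineering Sciences and Technology project (CKCEST-2021-2-12, CKCEST-2022-1-35).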
Keywords hydrological model knowledge; natural language processing; named entity recognition; BERT; pattern recognition; knowledge extraction
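The abstract names two technical ingredients without showing them: BMOES sequence labelling and the F1 score as the harmonic mean of precision and recall. The sketch below illustrates both on a toy sentence. It is not the authors' code: the function names `bmoes_tags` and `f1_score`, the label names `MODEL` and `BASIN`, and the example sentence are assumptions made here for illustration, and scoring by exact span match is one common convention that the record does not confirm.

```python
from typing import List, Set, Tuple

# A span is (start, end_exclusive, label); labels here are hypothetical.
Span = Tuple[int, int, str]


def bmoes_tags(n_chars: int, spans: List[Span]) -> List[str]:
    """Convert character-level entity spans into BMOES tags.

    B = begin, M = middle, E = end, S = single-character entity, O = outside.
    """
    tags = ["O"] * n_chars
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"M-{label}"
            tags[end - 1] = f"E-{label}"
    return tags


def f1_score(gold: Set[Span], pred: Set[Span]) -> float:
    """Entity-level F1: harmonic mean of precision and recall.

    A predicted span counts as correct only if it matches a gold span exactly.
    """
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy sentence (12 characters): "新安江模型应用于淮河流域"
text = "新安江模型应用于淮河流域"
gold_spans = [(0, 5, "MODEL"), (8, 12, "BASIN")]
for char, tag in zip(text, bmoes_tags(len(text), gold_spans)):
    print(char, tag)

# A prediction that truncates the basin name gets no credit for that span.
pred_spans = {(0, 5, "MODEL"), (8, 11, "BASIN")}
print(f1_score(set(gold_spans), pred_spans))
```

Under exact-match scoring, the truncated basin span counts as both a false positive and a false negative, which is why boundary errors on long entity names weigh heavily on F1.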