Knowledge extraction method for operation and maintenance texts of high-speed railway turnout
Abstract: To construct a knowledge graph for high-speed railway turnout operation and maintenance (O&M) automatically, and thereby support decision-making in intelligent turnout O&M, key knowledge must be extracted from turnout O&M texts. To also address entity nesting and overlapping knowledge triples in these texts, this paper proposes RTOM-KE, a knowledge extraction model for high-speed railway turnout O&M based on multi-module joint learning. First, given the defined entity and relation types, a two-stage BIOES-based labeling strategy labels the head entities and then, for each relation, the corresponding tail entities. Second, an encoder module composed of the lightweight pre-trained model BERT-base and a bidirectional long short-term memory network (BiLSTM) produces a multi-dimensional shared encoding of the text; the encoder's hidden states are combined with global context features and used as the input to the head entity extraction module. Finally, the head entity extraction module extracts all candidate head entities in the text. The candidate head entity labels and the multi-dimensional shared word representations from the encoder then form the joint input to the tail entity extraction module, where a relation-specific gate mechanism selects the tail entities associated with each head entity, yielding the turnout O&M knowledge triples. Comparative and ablation experiments show that RTOM-KE extracts triples of varying complexity fairly accurately and comprehensively and effectively handles entity nesting and triple overlap: on the turnout O&M dataset, its precision, recall, and F1 reach 88.3%, 86.9%, and 87.6%, respectively. The results can support further construction of the high-speed railway turnout O&M knowledge graph and intelligent turnout O&M, and can serve as a reference for improving knowledge extraction from more complex turnout maintenance texts and for information extraction in other professional fields.
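The two-stage BIOES labeling strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the tag set or data format, so the function names (`bioes_tags`, `decode_bioes`, `two_stage_labels`), the `HEAD`/`TAIL` labels, the relation names, and the English toy sentence below are all assumptions made for illustration, not the authors' implementation. The sketch also assumes that head entities do not overlap one another within a sentence.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]            # (start, end) token indices, end inclusive
Triple = Tuple[Span, str, Span]   # (head span, relation, tail span)


def bioes_tags(n_tokens: int, spans: List[Span], label: str) -> List[str]:
    """Encode non-overlapping spans as one BIOES tag sequence."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        if start == end:
            tags[start] = f"S-{label}"        # single-token entity
        else:
            tags[start] = f"B-{label}"        # begin
            for i in range(start + 1, end):
                tags[i] = f"I-{label}"        # inside
            tags[end] = f"E-{label}"          # end
    return tags


def decode_bioes(tags: List[str]) -> List[Span]:
    """Recover (start, end) spans from a BIOES tag sequence."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        prefix = tag[0]
        if prefix == "S":
            spans.append((i, i))
            start = None
        elif prefix == "B":
            start = i
        elif prefix == "E" and start is not None:
            spans.append((start, i))
            start = None
        elif prefix == "O":
            start = None
    return spans


def two_stage_labels(
    tokens: List[str], triples: List[Triple]
) -> Tuple[List[str], Dict[Tuple[Span, str], List[str]]]:
    """Stage 1: one sequence marking every head entity.
    Stage 2: one sequence per (head, relation) pair marking its tail entities."""
    heads = sorted({head for head, _, _ in triples})
    stage1 = bioes_tags(len(tokens), heads, "HEAD")
    grouped: Dict[Tuple[Span, str], List[Span]] = {}
    for head, relation, tail in triples:
        grouped.setdefault((head, relation), []).append(tail)
    stage2 = {
        key: bioes_tags(len(tokens), tails, "TAIL")
        for key, tails in grouped.items()
    }
    return stage1, stage2


# Hypothetical toy sentence: the head "switch machine" takes part in two
# triples, and one tail ("switch machine fault") contains the head itself.
tokens = ["switch", "machine", "fault", "causes", "point", "failure"]
triples = [
    ((0, 1), "has_fault", (0, 2)),
    ((0, 1), "causes", (4, 5)),
]
stage1, stage2 = two_stage_labels(tokens, triples)
```

Because each (head, relation) pair gets its own tail tag sequence, a tail that contains the head and a head shared by several triples can both be represented without conflicting tags, which is one plausible reading of how a two-stage scheme copes with nesting and overlap.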
Authors: LIN Haixiang; BAI Wansheng; ZHAO Zhengxiang; HU Nana; LI Dong; LU Renjie (School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; CASCO Signal Ltd., Shanghai 200071, China)
Source: Journal of Railway Science and Engineering (《铁道科学与工程学报》; indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2024, No. 7, pp. 2569-2580 (12 pages)
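Funding: Key Research and Development Program of Gansu Province, Industrial category (23YFGA0046); 2022 Open Project of the Railway Industry Key Laboratory of BIM Engineering and Intelligent Application for "Four-Electricity" (四电) Systems (BIMKF-2022-02).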
Keywords: high-speed railway; turnout; operation and maintenance; knowledge extraction; BERT model
