Intelligent data mining approach of text entity knowledge from construction documents of concrete dams
Abstract Construction information for concrete dams is recorded mostly as document text. This text is large in volume, widely distributed, and internally complex, so manual processing cannot extract its knowledge content or untangle the relationships among construction records accurately and efficiently. In natural language processing, named entities are the carriers of textual knowledge, and fast, accurate entity recognition is a prerequisite for construction knowledge mining. This paper proposes a method for intelligent recognition and mining of knowledge in concrete dam construction documents that combines deep learning with association rule techniques. The method couples a bi-directional long short-term memory (Bi-LSTM) neural network with a conditional random field (CRF), defines the entity types of concrete dam construction, builds a named entity recognition model, and produces a set of construction entity knowledge. On this basis, an entity association rule extraction technique is developed by considering the expression patterns of construction text and the entity types, predefining the relationships between entities, and determining the forms in which construction entities combine. Guided by this technique, the Apriori algorithm is improved to compute frequent itemsets and obtain strong association rules between entities. Applied to the weekly construction supervision reports of an actual concrete dam, the method achieves a named entity recognition precision of 86.42%, which verifies its accuracy. Analysis of the association rules between entities with the improved Apriori algorithm demonstrates the advantages of the improvement and helps make knowledge analysis of concrete dam construction documents more intelligent and fine-grained.
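The association-rule step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the authors modify Apriori, so the code below is plain Apriori plus one assumed constraint: a rule is kept only if its (antecedent type, consequent type) pair appears in a predefined set. The entity types (`LOC`, `ACT`, `EQP`), the sample transactions, the thresholds, and the names `ALLOWED_TYPE_PAIRS`, `frequent_itemsets`, and `strong_rules` are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy data: each transaction holds the (entity_type, entity_text)
# pairs that an NER model might return for one sentence of a weekly report.
transactions = [
    {("LOC", "dam section 5"), ("ACT", "concrete pouring"), ("EQP", "cable crane")},
    {("LOC", "dam section 5"), ("ACT", "concrete pouring")},
    {("LOC", "dam section 7"), ("ACT", "formwork removal")},
    {("LOC", "dam section 5"), ("ACT", "concrete pouring"), ("EQP", "cable crane")},
]

# Assumed predefined relationships: only rules whose (antecedent type,
# consequent type) pair is listed here are reported.
ALLOWED_TYPE_PAIRS = {("LOC", "ACT"), ("ACT", "EQP")}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Classic level-wise Apriori: returns {frozenset: support}."""
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {s for s in singles if support(s, transactions) >= min_support}
    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = support(s, transactions)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in result for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c, transactions) >= min_support}
    return result


def strong_rules(freq, min_confidence):
    """Pairwise rules a -> b that pass the type filter and confidence threshold."""
    rules = []
    for itemset, sup in freq.items():
        if len(itemset) != 2:
            continue
        for a in itemset:
            (b,) = itemset - {a}
            if (a[0], b[0]) not in ALLOWED_TYPE_PAIRS:
                continue
            confidence = sup / freq[frozenset([a])]
            if confidence >= min_confidence:
                rules.append((a, b, sup, confidence))
    return rules


freq = frequent_itemsets(transactions, min_support=0.5)
for antecedent, consequent, sup, conf in strong_rules(freq, min_confidence=0.7):
    print(antecedent, "->", consequent, f"support={sup:.2f}", f"confidence={conf:.2f}")
```

Filtering by entity type after mining keeps the sketch short. Applying the same constraint during candidate generation would shrink the search space, which may be closer to what an improved Apriori aims for, but that is a guess rather than something the abstract states.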
Authors TIAN Dan; SHEN Yang; LI Mingchao; HAN Shuai (State Key Laboratory of Hydraulic Engineering Simulation and Safety, Tianjin University, Tianjin 300350; China Three Gorges Corporation, Beijing 100038)
Source Journal of Hydroelectric Engineering (《水力发电学报》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 6, pp. 139-151 (13 pages)
Funding National Natural Science Foundation of China (51879185); National Key Research and Development Program of China (2018YFC0406905); Open Fund of the Hubei Key Laboratory of Construction and Management in Hydropower Engineering (2020KSD05).
Keywords concrete dam; construction document; named entity; intelligent recognition; deep learning; knowledge mining