Abstract
Objective: To explore a method for high-precision information extraction and structured processing of electronic medical records by combining natural language processing, structured algorithms, and knowledge graph techniques. Methods: A named entity recognition model, a relation recognition model, and a synonym recognition model were built to extract in-sentence information from medical record text. A medical record spanning tree algorithm was proposed to parse the hierarchical structure of long passages of medical record text. Knowledge graph technology was used to store the construction models for information extraction and hierarchical parsing, enabling high-precision extraction of medical record text information. Results: A high-precision medical record information extraction method integrating deep learning algorithms with structured analysis algorithms was established. The entity recognition model reached an accuracy of 95.74%, and the relation recognition model reached an accuracy of 89.20%. The method ultimately produces structured medical records with a clear hierarchy, in which information can be precisely located and extracted. Conclusion: The method explored in this paper integrates deep learning algorithms with structured analysis algorithms and addresses both in-sentence information extraction and the parsing of the structural hierarchy of medical records. It enables automatic extraction, precise location, and efficient management of medical record data, can lay a data foundation for clinical medical research, and can provide a methodological reference for mining medical record text for other diseases.
Authors
WANG Weixiao (王维笑)
FEI Xiaolu (费晓璐)
LYU Hairong (闾海荣)
WEI Lan (魏岚)
TAO Kun (陶焜)
ZHAO Ming (赵明)
FU Xu (付旭)
ZHAO Xupan (赵许盼)
GAO Fei (高菲)
REN Yi (任怡)
Source
China Digital Medicine (《中国数字医学》)
2024, No. 5, pp. 40-48 (9 pages)
Funding
Supported by the National Key Research and Development Program of China (2022YFF1202400).
Keywords
Electronic medical record
Information extraction
Natural language processing
Knowledge graph
Structured analysis