Abstract
Objective: To improve the quality of Chinese medical records and reduce the likelihood of errors in clinical documents by automatically correcting typos (wrongly written or misused characters) in Chinese medical record text. Methods: A statistical language model was fused with a neural-network-based pre-trained model. The fused model was trained and validated on typo correction for Chinese medical record text, and its performance was evaluated with F1 as the overall metric. Results: The fused model achieved an F1 of 0.6254 on typo correction in Chinese medical records, outperforming the statistical language model alone (F1 0.4813) and the pre-trained model alone (F1 0.5970). Conclusion: Fusing a statistical language model with a pre-trained model is effective for typo correction in Chinese medical record text and offers practical support for assuring the quality of clinical record writing.
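The abstract reports F1 as the evaluation metric but does not state how it was computed. The following is a minimal sketch of one common way to score typo correction, assuming character-level evaluation and substitution-only errors (source, prediction, and reference have equal length). The function name `correction_f1` and the example sentences are hypothetical and are not taken from the paper.

```python
def correction_f1(source, predicted, reference):
    """Character-level precision, recall and F1 for typo correction.

    Assumption (not from the paper): errors are substitutions only, so the
    three strings are aligned position by position and have equal length.
    """
    if not (len(source) == len(predicted) == len(reference)):
        raise ValueError("source, predicted and reference must have equal length")

    # Positions the model changed.
    changed = sum(1 for s, p in zip(source, predicted) if s != p)
    # Positions that actually needed a correction.
    needed = sum(1 for s, r in zip(source, reference) if s != r)
    # Positions that needed a correction and were changed to the right character.
    correct = sum(
        1 for s, p, r in zip(source, predicted, reference) if s != r and p == r
    )

    precision = correct / changed if changed else 0.0
    recall = correct / needed if needed else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example: one real typo is fixed, one correct character is broken.
source = "患者复部疼痛伴恶心"
reference = "患者腹部疼痛伴恶心"
predicted = "患者腹部疼痛半恶心"
precision, recall, f1 = correction_f1(source, predicted, reference)
```

In this example the model makes two changes, only one of which was needed, so precision is 0.5 while recall is 1.0. F1 is the harmonic mean of the two, which is why it serves as a single overall metric.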
Authors
姜会珍
焦雪莹
邹凌伟
许仕杰
朱卫国
JIANG Huizhen; JIAO Xueying; ZOU Lingwei; XU Shijie; ZHU Weiguo (Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, China)
Source
《中国卫生信息管理杂志》
2023, Issue 3, pp. 448-453 (6 pages)
Chinese Journal of Health Informatics and Management
Funding
Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (Project No. 2021-I2M-1-056).
Keywords
natural language processing
Chinese medical record
error detection
model ensemble