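To address the weak generalization of automatic International Classification of Diseases (ICD) coding methods caused by long-text processing, the hierarchical structure of the codes, and their long-tail distribution, this paper proposes a few-shot automatic ICD coding method based on prompt learning and hypersphere prototypes (hypersphere prototypical with prompt learning, PromptHP) that makes full use of medical pre-trained language models. First, code descriptions and clinical text are fused into the prompt template of the prompt-learning model so that the model understands the clinical text more deeply. Next, the prior knowledge of the pre-trained language model is used to produce an initial prediction. Then, hypersphere prototypes are introduced on top of the output representations of the pre-trained language model for class modeling and metric-based classification, and the network is fine-tuned on medical data to incorporate knowledge from the data and improve performance on few-shot ICD code assignment. Finally, the two sets of predictions are combined by weighted ensembling to obtain the final code predictions. Experiments on the public medical dataset MIMIC-III show that the model outperforms state-of-the-art baselines: PromptHP improves macro-AUC, micro-AUC, macro-F1, and micro-F1 on few-shot codes by 1.77%, 1.54%, 14.22%, and 15.01%, respectively. The results verify the effectiveness of the model on few-shot code classification.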
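The last two steps of the pipeline described above (metric classification against hypersphere prototypes, then a weighted ensemble with the prompt-based prediction) can be sketched as follows. The abstract gives no formulas, so everything here is an assumption made for illustration: each code is modeled as a center plus a radius, the score is the negative distance to the sphere surface, both branches are squashed with a sigmoid, and the function names and the `weight` parameter are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def hypersphere_scores(embeddings, centers, radii):
    """Score each document against each code's hypersphere prototype.

    Assumption: a prototype is a (center, radius) pair and the score is the
    negative signed distance to the sphere surface, so points inside the
    sphere get positive scores.
    """
    # Pairwise Euclidean distances, shape (n_docs, n_codes).
    dists = np.linalg.norm(embeddings[:, None, :] - centers[None, :, :], axis=-1)
    return -(dists - radii[None, :])


def ensemble_probs(prompt_logits, embeddings, centers, radii, weight=0.5):
    """Weighted combination of the prompt branch and the prototype branch.

    `weight` is a hypothetical mixing coefficient in [0, 1]; the abstract only
    says that the two predictions are combined by weighting.
    """
    p_prompt = sigmoid(prompt_logits)
    p_proto = sigmoid(hypersphere_scores(embeddings, centers, radii))
    return weight * p_prompt + (1.0 - weight) * p_proto


if __name__ == "__main__":
    # Toy data: two documents, two codes, 2-D embeddings.
    embeddings = np.array([[0.0, 0.0], [3.0, 4.0]])
    centers = np.array([[0.0, 0.0], [3.0, 4.0]])
    radii = np.array([1.0, 1.0])
    prompt_logits = np.zeros((2, 2))
    print(ensemble_probs(prompt_logits, embeddings, centers, radii))
```

In this toy setup each document sits exactly on one code's center, so that code receives the higher combined probability even though the prompt branch is uninformative.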
Developments in information technology (IT) have changed the use of healthcare terminologies from paper-based mortality statistics built on the WHO International Classification of Diseases (ICD) to IT-based morbidity implementations, for instance in Casemix-based healthcare funding and management systems. This higher level of granularity has spread worldwide under the umbrella of several national modifications named ICD10 XM. These developments have coincided with the increased use of the international clinical reference terminology named SNOMED. When the update from WHO ICD10 to WHO ICD11 was decided, a merger was envisaged, and joint work by WHO and SNOMED CT proposed a methodology for creating a common formal ontology between the 11th version of the WHO International Classification of Diseases and Health Problems (ICD) and the most widely used clinical terminology in the world, the Systematized Nomenclature of Human and Veterinary Medicine - Clinical Terms (SCT). The present work continues that unfinished effort and aims to develop a SNOMED-based formal ontology for ICD11 chapter 1, using the textual definitions of ICD11 codes, which are an entirely new feature of ICD, and the ontology tools provided by SCT in the publicly available SNOMED Browser. There are two key results: the lexical alignment is complete, and the ontology alignment is incomplete under the validated SNOMED concept model but can be completed with not-yet-validated attributes and values of the SNOMED Compositional Grammar. The work opens a new era for the seamless use of both international terminologies for morbidity, for instance in DRG/Casemix and clinical management. The main limitation is that it is restricted to 1 of the 26 chapters of ICD11.
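Objective: To analyze the factors that affect the accuracy of standard disease classification with International Classification of Diseases (ICD) codes and to identify appropriate countermeasures. Methods: Clinical data of 1,440 patients admitted to Hechi Third People's Hospital from January 2021 to December 2023 were retrospectively selected, coded following the standard coding process, and checked; the specific causes of coding errors and their remedies were analyzed. Results: Among the 73 medical records with coding errors, 12 had an incorrectly selected principal diagnosis, 14 had coding errors for difficult diagnoses, 18 were not coded according to the pathology report, 10 did not follow the combination coding principle, and 19 left comorbidities uncoded. By department, the 73 records were distributed as follows: orthopedics 15, cardiology 18, neurology 20, urology 10, pediatrics 8, and obstetrics 2. By cause, 25 (34.25%) were due to nonstandard medical record writing, 22 (30.14%) to lead-term errors, 11 (15.07%) to insufficient coder proficiency, 6 (8.22%) to special coding rules for special diseases, and 9 (12.33%) to other causes. Conclusion: The accuracy of ICD coding of disease diagnoses in hospital medical records is affected by nonstandard medical record writing, lead-term errors, and insufficient coder proficiency; coders' professional skills should be improved, errors at work reduced, and coding quality continuously improved.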
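The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given in the abstract; the English dictionary keys are translations introduced here, and the overall error rate assumes that one record was reviewed per patient (1,440 records), which the abstract does not state explicitly.

```python
# Counts taken from the abstract; key names are translations for illustration.
error_types = {
    "principal diagnosis selection": 12,
    "difficult diagnosis": 14,
    "not coded per pathology report": 18,
    "combination coding principle not followed": 10,
    "comorbidity not coded": 19,
}
departments = {
    "orthopedics": 15,
    "cardiology": 18,
    "neurology": 20,
    "urology": 10,
    "pediatrics": 8,
    "obstetrics": 2,
}
error_causes = {
    "nonstandard record writing": 25,
    "lead-term error": 22,
    "insufficient coder proficiency": 11,
    "special coding for special diseases": 6,
    "other": 9,
}

total_errors = sum(error_causes.values())

# Each breakdown should account for the same number of erroneous records.
totals = {
    "by type": sum(error_types.values()),
    "by department": sum(departments.values()),
    "by cause": total_errors,
}

# Share of each cause among all erroneous records, in percent.
cause_shares = {
    cause: round(100 * count / total_errors, 2)
    for cause, count in error_causes.items()
}

# Assumption: one reviewed record per patient, so 1,440 records in total.
records_reviewed = 1440
overall_error_rate = round(100 * total_errors / records_reviewed, 2)

if __name__ == "__main__":
    print(totals)
    print(cause_shares)
    print(overall_error_rate)
```

All three breakdowns sum to 73, and the recomputed shares match the percentages quoted in the abstract.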