Mapping Chinese Medical Entities to the Unified Medical Language System


Abstract: Background. Chinese medical entities have not been organized comprehensively because well-developed terminology systems are lacking, which makes it difficult to process Chinese medical texts for fine-grained medical knowledge representation. Mapping Chinese medical entities to their English counterparts in the Unified Medical Language System (UMLS) is an efficient way to unify Chinese medical terminologies. However, such mappings have not been investigated sufficiently in previous research. In this study, we explore strategies for mapping Chinese medical entities to the UMLS and systematically evaluate the mapping performance. Methods. First, Chinese medical entities are translated into English using multiple web-based translation engines. Then, 3 mapping strategies are investigated: (a) string-based, (b) semantic-based, and (c) string and semantic similarity combined. In addition, cross-lingual pretrained language models are applied to map Chinese medical entities to UMLS concepts without translation. All of these strategies are evaluated on the ICD10-CN, Chinese Human Phenotype Ontology (CHPO), and RealWorld datasets. Results. The linear combination method based on the SapBERT and term frequency-inverse document frequency (TF-IDF) bag-of-words models performs best on all evaluation datasets, with top-5 accuracies of 91.85%, 82.44%, and 78.43% on the ICD10-CN, CHPO, and RealWorld datasets, respectively. Conclusions. We explore strategies for mapping Chinese medical entities to the UMLS and identify a satisfactory linear combination method. Our investigation will facilitate Chinese medical entity normalization and inspire research on Chinese medical ontology development.
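The combined strategy and the top-k metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the weighting scheme, so the mixing weight `alpha`, the helper names, and the toy candidate list are all hypothetical. The semantic similarities are made-up numbers standing in for cosine similarities between SapBERT embeddings, and the TF-IDF bag-of-words model is written in plain Python so the example runs without external libraries.

```python
import math
from collections import Counter

def tokenize(term):
    # Lowercased whitespace tokens; a simple stand-in for a real tokenizer.
    return term.lower().split()

def fit_idf(terms):
    # Smoothed inverse document frequency over the candidate concept names.
    n = len(terms)
    df = Counter(tok for t in terms for tok in set(tokenize(t)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf_vector(term, idf):
    # Sparse TF-IDF vector; tokens unseen in the candidates are ignored.
    tf = Counter(tokenize(term))
    return {tok: cnt * idf[tok] for tok, cnt in tf.items() if tok in idf}

def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_candidates(query, candidates, semantic_sim, alpha=0.5):
    # Linear combination: alpha weights the semantic score (assumed to be
    # precomputed, e.g. from embeddings), 1 - alpha weights the TF-IDF score.
    idf = fit_idf(candidates)
    q = tfidf_vector(query, idf)
    scored = []
    for cand in candidates:
        string_sim = cosine(q, tfidf_vector(cand, idf))
        score = alpha * semantic_sim[cand] + (1 - alpha) * string_sim
        scored.append((score, cand))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda x: (-x[0], x[1]))
    return [cand for _, cand in scored]

def top_k_accuracy(rankings, gold, k=5):
    # Fraction of queries whose gold concept appears in the top k candidates.
    hits = sum(1 for r, g in zip(rankings, gold) if g in r[:k])
    return hits / len(gold)

# Toy data: an English translation of a Chinese entity and three candidate names.
candidates = ["myocardial infarction", "cerebral infarction", "heart failure"]
query = "acute myocardial infarction"
# Hypothetical semantic similarities (placeholders, not real model output).
semantic_sim = {
    "myocardial infarction": 0.9,
    "cerebral infarction": 0.6,
    "heart failure": 0.5,
}
ranking = rank_candidates(query, candidates, semantic_sim, alpha=0.5)
print(ranking)
print(top_k_accuracy([ranking], ["myocardial infarction"], k=1))
```

In practice the candidate set would be UMLS concept names and the semantic scores would come from an embedding model; `alpha` would be tuned on held-out data rather than fixed at 0.5.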

Authors: Luming Chen, Yifan Qi, Aiping Wu, Lizong Deng, Taijiao Jiang

Affiliations: Guangzhou Laboratory Guangzhou Medical University Institute of Systems Medicine Suzhou Institute of Systems Medicine

Source: Health Data Science (Chinese title: 健康数据科学（英文）), 2023, Issue 1, pp. 31-40 (10 pages)

Funding: the National Key Research and Development Program of China (2021YFC2302001); the CAMS Innovation Fund for Medical Sciences (CIFMS) (2021-1-I2M-051 and 2021-I2M-1-001); the National Natural Science Foundation of China (grant 31671371); the Emergency Key Program of Guangzhou Laboratory (grant EKPG21-12).

Keywords: SEMANTIC, STRING, TERMINOLOGY

Classification code: H31 [Language and Script: English]



