Abstract
Analysis of Chinese electronic medical record text faces challenges such as polysemy and incomplete entity recognition. To address them, a deep learning framework is constructed that combines the RoBERTa-WWM model with a BiLSTM-CRF module. First, the pre-trained RoBERTa-WWM language model is deeply fused with the semantic features produced by the Transformer layers to capture the complex contextual information of the text. Then, the fused semantic representation is fed into the BiLSTM and CRF modules, which further refine entity boundaries and recognition accuracy. Finally, an empirical evaluation on the CCKS2019 dataset yields an F1 score of 82.94%. This result indicates the strong performance of the RoBERTa-WWM-BiLSTM-CRF model for named entity recognition in Chinese electronic medical records.
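The role of the CRF layer in refining entity boundaries can be illustrated with a small, self-contained sketch of Viterbi decoding under BIO tagging constraints. This is not the authors' implementation. It assumes per-token emission scores (which the BiLSTM would supply in the framework described above) are already available as plain dictionaries, and it uses hard BIO rules plus an optional transition-score table in place of the transition scores a trained CRF would learn. The tag name `DIS` and all scores are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

NEG_INF = float("-inf")


def allowed(prev: str, curr: str) -> bool:
    """Hard BIO rule: I-X may only follow B-X or I-X of the same type."""
    if curr.startswith("I-"):
        entity_type = curr[2:]
        return prev in ("B-" + entity_type, "I-" + entity_type)
    return True


def viterbi_decode(
    emissions: List[Dict[str, float]],
    tags: List[str],
    transitions: Optional[Dict[Tuple[str, str], float]] = None,
) -> List[str]:
    """Return the highest-scoring tag sequence that respects the BIO rules.

    emissions[i][tag] is the (hypothetical) score of `tag` at token i.
    transitions[(prev, curr)] is an optional extra score; missing pairs
    count as 0.0. A trained CRF would learn these values.
    """
    if not emissions:
        return []
    trans = transitions or {}

    # A sequence may not start inside an entity.
    score = {
        t: (NEG_INF if t.startswith("I-") else emissions[0][t]) for t in tags
    }
    backpointers: List[Dict[str, Optional[str]]] = []

    for emission in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointer: Dict[str, Optional[str]] = {}
        for curr in tags:
            best_prev, best = None, NEG_INF
            for prev in tags:
                if not allowed(prev, curr):
                    continue
                candidate = score[prev] + trans.get((prev, curr), 0.0)
                if candidate > best:
                    best_prev, best = prev, candidate
            pointer[curr] = best_prev
            new_score[curr] = best + emission[curr] if best_prev else NEG_INF
        backpointers.append(pointer)
        score = new_score

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-token input. Taking the best tag per
# token independently would give I-DIS, I-DIS, O, which starts an entity
# with an I- tag; the constrained decoder repairs the left boundary.
example_tags = ["O", "B-DIS", "I-DIS"]
example_emissions = [
    {"O": 0.1, "B-DIS": 0.2, "I-DIS": 2.0},
    {"O": 0.1, "B-DIS": 0.3, "I-DIS": 1.5},
    {"O": 1.0, "B-DIS": 0.1, "I-DIS": 0.2},
]
print(viterbi_decode(example_emissions, example_tags))
# prints ['B-DIS', 'I-DIS', 'O']
```

In this sketch the constraint alone is enough to turn an invalid tag sequence into a well-formed entity span, which is the kind of boundary refinement the abstract attributes to the CRF module.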
Authors
LIU Huimin (刘慧敏)
HUANG Xia (黄霞)
XIONG Fei (熊菲)
WANG Guoqing (王国庆)
Haiyuan College, Kunming Medical University, Kunming 650000, China
Source
Changjiang Information & Communications (《长江信息通信》)
2024, Issue 3, pp. 7-9 (3 pages)
Funding
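Scientific Research Fund Project of Haiyuan College, Kunming Medical University: "Research on Chinese Named Entity Recognition Based on Natural Language Processing Technology" (Project No. 2022HY014).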