Abstract
Electronic Medical Records (EMR), with their unstructured sentences and varied conceptual expressions, provide rich information for medical information extraction. However, common Named Entity Recognition (NER) methods in Natural Language Processing (NLP) are not well suited to clinical NER in EMR. This study applies neural networks to clinical concept extraction. We integrate a Bidirectional Long Short-Term Memory network (Bi-LSTM) with a Conditional Random Fields (CRF) layer to detect three types of clinical named entities. The word representations fed into the network are formed by concatenating character-based word embeddings with Continuous Bag of Words (CBOW) embeddings trained on both domain and non-domain corpora. We test our NER system on the i2b2/VA open datasets and compare its performance with six related works, achieving the best NER result with an F1 value of 0.8537. We also point out a few specific problems in clinical concept extraction, which may offer hints for further study.
Funding
National Social Science Foundation of China (No. 17BYY047)