Temporal information is pervasive and crucial in medical records and other clinical text, as it describes how medical conditions develop and is vital for clinical decision making. However, providing a holistic knowledge representation and reasoning framework for the various time expressions in clinical text is challenging. To capture complex temporal semantics in clinical text, we propose a novel Clinical Time Ontology (CTO) as an extension of the OWL framework. More specifically, we identified eight time-related problems in clinical text and created 11 core temporal classes to conceptualize fuzzy time, cyclic time, irregular time, negations, and other complex aspects of clinical time. We then extended Allen's and TEO's temporal relations and defined the relation concept description between complex and simple time. We also provided a formulaic and graphical presentation of complex time and complex time relationships. We carried out an empirical study of the expressiveness and usability of CTO using real-world healthcare datasets. The experimental results demonstrate that CTO could faithfully represent and reason over 93% of the temporal expressions, and that it can cover a wider range of time-related classes in the clinical domain.
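For readers unfamiliar with the Allen relations that this abstract builds on, the following is a minimal sketch of the 13 classical interval relations between two simple (crisp) time intervals. It is not the CTO itself: the abstract does not define CTO's extended relations or its fuzzy, cyclic, and irregular time classes, so none of them are modeled here. The names `Interval` and `allen_relation` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A crisp time interval; start must be strictly before end (assumption)."""
    start: float
    end: float


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the Allen relation that holds from a to b."""
    if a.start >= a.end or b.start >= b.end:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint intervals, with or without a gap between them.
    if a.end < b.start:
        return "before"
    if b.end < a.start:
        return "after"
    if a.end == b.start:
        return "meets"
    if b.end == a.start:
        return "met-by"
    # Intervals sharing one or both endpoints.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    # Strict containment in either direction.
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a.start < b.start else "overlapped-by"
```

For example, a hospital stay covering days 1 to 4 and a medication course covering days 2 to 6 would be classified as "overlaps". Expressions such as "every morning" or "about two weeks ago" fall outside this crisp model, which is the gap the abstract says CTO is meant to address.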
The extraction of features from unstructured clinical data of COVID-19 patients is critical for guiding clinical decision-making and diagnosing this viral disease. Furthermore, an early and accurate diagnosis of COVID-19 can reduce the burden on healthcare systems. In this paper, an improved term weighting technique combined with Part-of-Speech (POS) tagging is proposed to reduce dimensions for automatic and effective classification of clinical text related to COVID-19. Term Frequency-Inverse Document Frequency (TF-IDF) is the most often used term weighting scheme (TWS). However, several variants of TF-IDF have been developed to address its drawbacks; in particular, it is not efficient enough at assigning effective weights to terms in unstructured data for text classification. In this research, we propose a modified term weighting scheme, RTF-C-IEF, and compare it with four extraction methods: TF, TF-IDF, TF-IHF, and TF-IEF. The experiment was conducted on two new datasets for COVID-19 patients. The first dataset, with 3053 clinical records, was collected from government hospitals in Iraq, and the second dataset, with 1446 clinical reports, was collected from several different websites. Based on the experimental results using several popular classifiers applied to the COVID-19 datasets, we observe that the proposed scheme RTF-C-IEF is a consistent performer with the best scores in most of the experiments. Further, the modified RTF-C-IEF proposed in the study outperformed the original scheme and the other term weighting methods employed in most experiments. Thus, the proper selection of a term weighting scheme among the different methods improves the performance of the classifier and helps to find the informative terms.
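As a point of reference for the baseline named in this abstract, here is a minimal sketch of plain TF-IDF weighting over tokenized documents. It assumes one common formulation (relative term frequency multiplied by the natural log of N over document frequency); the abstract does not give the formula for RTF-C-IEF, TF-IHF, or TF-IEF, so those schemes are not implemented here. The function name `tf_idf` and the toy tokens are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents containing each term.
    doc_freq: Counter = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Relative term frequency times log-scaled inverse document frequency.
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

With two toy records, `[["fever", "cough"], ["fever", "headache"]]`, the term "fever" appears in every document and receives weight 0, while "cough" and "headache" receive positive weights. This illustrates why the choice of weighting scheme changes which terms a classifier treats as informative.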
The China Conference on Knowledge Graph and Semantic Computing (CCKS) 2020 Evaluation Task 3 addressed clinical named entity recognition and event extraction for Chinese electronic medical records. Two annotated datasets and some additional resources for these two subtasks were provided to participants. The evaluation attracted 354 teams, 46 of which successfully submitted valid results. Pre-trained language models were widely applied in this evaluation task. Data augmentation and external resources were also helpful.
Funding: supported by the National Natural Science Foundation of China (No. U1836118), the Open Fund of the Key Laboratory of Content Organization and Knowledge Services for Rich Media Digital Publishing (ZD2021-11/01), and the Natural Science Foundation of the Hubei Province Educational Committee (B2019009).