Funding: Supported by the National Natural Science Foundation of China (61133012, 61202193, 61373108), the Major Projects of the National Social Science Foundation of China (11&ZD189), the Chinese Postdoctoral Science Foundation (2013M540593, 2014T70722), and the Open Foundation of Shandong Key Laboratory of Language Resource Development and Application.
Abstract: In this paper, we propose a multiple-feature approach to the normalization task, which maps each disorder mention in the text to a unique Unified Medical Language System (UMLS) concept unique identifier (CUI). We develop a two-step method: first, acquire a list of candidate CUIs and their associated preferred names using the UMLS API; second, choose the closest CUI by calculating the similarity between the input disorder mention and each candidate. The similarity calculation step is formulated as a classification problem, and multiple features (string features, ranking features, similarity features, and contextual features) are used to normalize the disorder mentions. The results show that the multiple-feature approach improves the accuracy of the normalization task from 32.99% to 67.08% compared with the MetaMap baseline.
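The two-step idea (gather candidates, then pick the closest one) can be illustrated with a minimal sketch. It is not the authors' method: the abstract describes a classifier over string, ranking, similarity, and contextual features, whereas this sketch scores candidates with a single string-similarity measure from Python's standard library. The candidate list is hard-coded rather than fetched from the UMLS API, and the identifiers `CUI-0001` etc. are hypothetical placeholders, not real CUIs.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (a stand-in for the paper's features)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def normalize_mention(mention, candidates):
    """Return the identifier of the candidate whose preferred name is closest.

    candidates: list of (identifier, preferred_name) pairs, i.e. the output
    of step one. Returns None when the candidate list is empty.
    """
    if not candidates:
        return None
    best_id, _ = max(candidates, key=lambda c: string_similarity(mention, c[1]))
    return best_id


# Hypothetical step-one output; a real system would query UMLS here.
candidates = [
    ("CUI-0001", "Myocardial infarction"),
    ("CUI-0002", "Cerebral infarction"),
    ("CUI-0003", "Myocardial ischemia"),
]

best = normalize_mention("myocardial infarct", candidates)
print(best)
```

Replacing `string_similarity` with a trained classifier score over several features would bring the sketch closer to the formulation described in the abstract.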
Funding: This work was supported by the National Key Research and Development Project of China (No. 2019YFB2102500), the Strategic Priority CAS Project (No. XDB38040200), the National Natural Science Foundation of China (Nos. 62206269, U1913210), the Guangdong Provincial Science and Technology Projects (Nos. 2022A1515011217, 2022A1515011557), and the Shenzhen Science and Technology Projects (No. JSGG20211029095546003).
Abstract: Nonnegative Matrix Factorization (NMF) is one of the most popular feature learning techniques in machine learning and pattern recognition. It has been widely used and studied in multi-view clustering tasks because of its effectiveness. This study proposes a general semi-supervised multi-view nonnegative matrix factorization algorithm. The algorithm incorporates discriminative and geometric information about the data to learn a better-fused representation, and adopts a feature normalizing strategy to align the different views. Two specific implementations of this algorithm are developed to validate the effectiveness of the proposed framework: Graph regularization based Discriminatively Constrained Multi-View Nonnegative Matrix Factorization (GDCMVNMF) and Extended Multi-View Constrained Nonnegative Matrix Factorization (ExMVCNMF). The intrinsic connection between these two implementations is discussed, and the optimization based on multiplicative update rules is presented. Experiments on six datasets show that GDCMVNMF and ExMVCNMF outperform several representative unsupervised and semi-supervised multi-view NMF approaches.
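For readers unfamiliar with multiplicative update rules, the sketch below shows the classic unsupervised, single-view NMF updates for the squared-error objective. It is not GDCMVNMF or ExMVCNMF: it has no graph regularization, label constraints, or multiple views. It is written in pure Python for self-containment (a practical implementation would normally use a numerical library), and the function names, the small `eps` term, and the toy matrix are assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(X, W, H):
    """Squared Frobenius norm of X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def scale(M, num, den, eps):
    """Elementwise M * num / (den + eps); eps guards against division by zero."""
    return [[m * a / (b + eps) for m, a, b in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, num, den)]


def nmf(X, rank, iters=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix X (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        H = scale(H, matmul(Wt, X), matmul(matmul(Wt, W), H), eps)
        # W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        W = scale(W, matmul(X, Ht), matmul(W, matmul(H, Ht)), eps)
    return W, H


# Toy nonnegative data matrix (illustrative only).
X = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0], [2.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
W, H = nmf(X, rank=2)
print(squared_error(X, W, H))
```

Because each update multiplies the current factor by a ratio of nonnegative quantities, W and H stay nonnegative without any explicit projection step.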