Funding: Supported by the Advanced Training Project of the Professional Leaders in Jiangsu Higher Vocational Colleges (2020GRFX006).
Abstract: Recent advances in OCR show that end-to-end (E2E) training pipelines that combine detection and recognition can achieve the best results. However, many existing methods focus on case-insensitive English characters. In this paper, we apply an E2E approach, the multiplex multilingual Mask TextSpotter, which performs script identification at the word level and uses different recognition heads to process different scripts while maintaining a unified loss, thus optimizing script identification and the multiple recognition heads simultaneously. Experiments show that this method outperforms a single-head model with a similar number of parameters on end-to-end recognition tasks.
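The word-level routing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `route_word` and `combined_loss`, the toy heads, and the choice to add a script-identification cross-entropy term to the selected head's recognition loss are all assumptions made for illustration. The abstract only states that a unified loss is used.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a recognition head: maps word features to text.
# In a real system each head would be a neural decoder for one script.
Head = Callable[[List[float]], str]


def route_word(
    script_probs: Dict[str, float],
    heads: Dict[str, Head],
    features: List[float],
) -> Tuple[str, str]:
    """Pick the most probable script for a word and run that script's head."""
    script = max(script_probs, key=lambda s: script_probs[s])
    return script, heads[script](features)


def combined_loss(
    script_probs: Dict[str, float],
    true_script: str,
    head_loss: float,
) -> float:
    """Assumed form of a joint objective: cross-entropy for script
    identification plus the recognition loss of the matching head."""
    script_loss = -math.log(script_probs[true_script])
    return script_loss + head_loss


# Toy heads that ignore their input; for illustration only.
toy_heads: Dict[str, Head] = {
    "latin": lambda feats: "hello",
    "arabic": lambda feats: "salam",
}
```

Because both terms enter a single scalar, one backward pass would update the script classifier and the chosen head together, which is one way to read "optimizing simultaneously".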
Abstract: Language testing is important and necessary. In English language testing today, the multiple-choice item is the most widely used item type, and many users regard it as the most flexible and probably the most effective of the objective item types. The multiple-choice item has its own characteristics, advantages, and disadvantages. We should build on its strengths to offset its weaknesses and use it appropriately. Although it has limitations, it is suitable for large-scale tests and for tests covering a wide range of knowledge. We should apply testing principles and methods correctly in order to make testing more effective and reliable.
Funding: Supported by the National Natural Science Foundation of China (61133012, 61202193, 61373108), the Major Projects of the National Social Science Foundation of China (11&ZD189), the Chinese Postdoctoral Science Foundation (2013M540593, 2014T70722), and the Open Foundation of Shandong Key Laboratory of Language Resource Development and Application.
Abstract: In this paper we propose a multiple-feature approach for the normalization task, which maps each disorder mention in the text to a unique Unified Medical Language System (UMLS) concept unique identifier (CUI). We develop a two-step method: first, acquire a list of candidate CUIs and their associated preferred names using the UMLS API; second, choose the closest CUI by calculating the similarity between the input disorder mention and each candidate. The similarity calculation step is formulated as a classification problem, and multiple features (string features, ranking features, similarity features, and contextual features) are used to normalize the disorder mentions. The results show that the multiple-feature approach improves the accuracy of the normalization task from 32.99% to 67.08% compared with the MetaMap baseline.
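The second step of the method above, choosing the closest candidate, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's classifier: it scores candidates with a single string feature (the `difflib.SequenceMatcher` ratio from the Python standard library) in place of the multi-feature classification model, it takes the candidate list as given rather than calling the UMLS API, and the identifiers `CUI_A` and `CUI_B` are hypothetical placeholders, not real CUIs.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; one simple string feature."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def choose_cui(mention: str, candidates: List[Tuple[str, str]]) -> Tuple[str, float]:
    """Return the (CUI, score) pair whose preferred name best matches the mention.

    `candidates` holds (CUI, preferred name) pairs, assumed to come from a
    separate candidate-retrieval step that is not shown here.
    """
    if not candidates:
        raise ValueError("no candidate CUIs supplied")
    scored = [(cui, string_similarity(mention, name)) for cui, name in candidates]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical candidates for illustration; the CUIs are placeholders.
example_candidates = [
    ("CUI_A", "myocardial infarction"),
    ("CUI_B", "cerebral infarction"),
]
best_cui, best_score = choose_cui("Myocardial Infarction", example_candidates)
```

In the approach the abstract describes, this single score would be only one input among the string, ranking, similarity, and contextual features passed to the classifier.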