Abstract: Identifying hierarchically related entities is a critical step towards constructing bio-networks in the field of biomedical text mining. To this end, we adopt a mapping-based approach: we first map bio-entities to terms in an established ontology, Medical Subject Headings (MeSH), and then use the hierarchical relationships available in MeSH to recognize hierarchically related entities. Specifically, we present two approaches for mapping biomedical entities identified using the Unified Medical Language System (UMLS) Metathesaurus to MeSH terms. The first approach uses a special feature provided by the MetaMap algorithm, whereas the second employs approximate phrase-based matching to map entities directly to MeSH terms. The two approaches deliver comparable results, with accuracies of 72% and 75%, respectively, on two evaluation datasets. A thorough error analysis shows that the two approaches have only around 10% of their errors in common, indicating that they are complementary.
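The mapping-based idea can be illustrated with a minimal sketch, offered under several assumptions rather than as the authors' implementation. It stands in for the second approach only: `difflib.SequenceMatcher` is a swapped-in similarity measure, not the approximate phrase-based matcher the abstract refers to, and MetaMap is not used at all. The `TOY_MESH` table, the function names, and the 0.8 threshold are hypothetical; the tree numbers are illustrative and should be checked against the current MeSH release. The sketch relies on one property of MeSH: a term is an ancestor of another when one of its tree numbers is a dot-delimited prefix of one of the other's.

```python
from difflib import SequenceMatcher

# Hypothetical toy lookup table: MeSH-style term -> tree numbers.
# Values are illustrative only, not an extract of a MeSH release.
TOY_MESH = {
    "Neoplasms": ["C04"],
    "Breast Neoplasms": ["C04.588.180"],
    "Leukemia": ["C04.557.337"],
}


def normalize(phrase):
    """Lowercase and collapse whitespace before comparison."""
    return " ".join(phrase.lower().split())


def map_to_mesh(entity, threshold=0.8):
    """Return the most similar toy term, or None if below the threshold."""
    best_term, best_score = None, 0.0
    for term in TOY_MESH:
        score = SequenceMatcher(None, normalize(entity), normalize(term)).ratio()
        if score > best_score:
            best_term, best_score = term, score
    return best_term if best_score >= threshold else None


def is_ancestor(parent, child):
    """True if a tree number of parent is a proper prefix of one of child's."""
    return any(
        c.startswith(p + ".")
        for p in TOY_MESH[parent]
        for c in TOY_MESH[child]
    )


def hierarchically_related(entity_a, entity_b):
    """Map both entities, then test ancestry in either direction."""
    a, b = map_to_mesh(entity_a), map_to_mesh(entity_b)
    if a is None or b is None or a == b:
        return False
    return is_ancestor(a, b) or is_ancestor(b, a)


if __name__ == "__main__":
    print(map_to_mesh("breast neoplasm"))
    print(hierarchically_related("breast neoplasm", "neoplasms"))
    print(hierarchically_related("breast neoplasm", "leukemia"))
```

Keeping the mapping step separate from the hierarchy check mirrors the two-stage structure described in the abstract, so a different mapper could be substituted without touching the ancestry test.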