To resolve the ontology understanding problem, the structural features and the potential important terms of a large-scale ontology are investigated from the perspective of complex networks analysis. Through the empiri...To resolve the ontology understanding problem, the structural features and the potential important terms of a large-scale ontology are investigated from the perspective of complex networks analysis. Through the empirical studies of the gene ontology with various perspectives, this paper shows that the whole gene ontology displays the same topological features as complex networks including "small world" and "scale-free",while some sub-ontologies have the "scale-free" property but no "small world" effect.The potential important terms in an ontology are discovered by some famous complex network centralization methods.An evaluation method based on information retrieval in MEDLINE is designed to measure the effectiveness of the discovered important terms.According to the relevant literature of the gene ontology terms,the suitability of these centralization methods for ontology important concepts discovering is quantitatively evaluated.The experimental results indicate that the betweenness centrality is the most appropriate method among all the evaluated centralization measures.展开更多
基金The National Basic Research Program of China (973Program) (No.2005CB321802)Program for New Century Excellent Talents in University (No.NCET-06-0926)the National Natural Science Foundation of China (No.60873097,90612009)
文摘To resolve the ontology understanding problem, the structural features and the potential important terms of a large-scale ontology are investigated from the perspective of complex networks analysis. Through the empirical studies of the gene ontology with various perspectives, this paper shows that the whole gene ontology displays the same topological features as complex networks including "small world" and "scale-free",while some sub-ontologies have the "scale-free" property but no "small world" effect.The potential important terms in an ontology are discovered by some famous complex network centralization methods.An evaluation method based on information retrieval in MEDLINE is designed to measure the effectiveness of the discovered important terms.According to the relevant literature of the gene ontology terms,the suitability of these centralization methods for ontology important concepts discovering is quantitatively evaluated.The experimental results indicate that the betweenness centrality is the most appropriate method among all the evaluated centralization measures.