Funding: R&D Infrastructure and Facility Development (No. 2005DKA64201); the National High Technology Research and Development Program of China (863 Program) (No. 2006AA12Z202)
Abstract: To improve the performance of ontology matching, a more efficient matching algorithm is proposed that eliminates unnecessary entity-matching operations. Through theoretical analysis and proof, a set of matching rules is derived that describes the inherent relations among the matching results of entities. Based on these rules, the proposed algorithm reuses the matching result of two entities to determine directly the matching results of their adjacent entities. Redundant matching of adjacent entities is thereby avoided, which improves the performance of the whole matching process. The experimental results show that, compared with related algorithms, the proposed algorithm has high matching accuracy and markedly reduces the time consumed by the whole matching process. The proposed algorithm is therefore better suited to large-scale ontology matching, which often arises in practical projects that integrate heterogeneous web resources.
Abstract: On the semantic web, data interoperability and ontology heterogeneity are becoming ever more important issues. To resolve these problems, multiple classification methods can be used to learn the matching between ontologies. The paper uses a general statistical classification method to discover category features in data instances and uses the first-order learning algorithm FOIL to exploit the semantic relations among data instances. When a multistrategy learning approach is used, a central problem is the evaluation of multistrategy classifiers. The goal and the conditions of using multistrategy classifiers in ontology matching differ from those in general text classification. This paper describes a combination rule for multiple classifiers, called the Best Outstanding Champion, which is suitable for heterogeneous ontology mapping. Working on the prediction results of the individual methods, the rule effectively accumulates the correct matches found by each individual classifier. The experiments show that the approach achieves high accuracy on a real-world domain.
Funding: Supported by the National Natural Science Foundation of China (No. 61003018), the Natural Science Foundation of Jiangsu Province, China (No. BK2011189), and the National Social Science Foundation of China (No. 11AZD121)
Abstract: Many ontologies have been published on the Semantic Web to be shared for describing resources. Among them, large ontologies of real-world domains pose a scalability problem for semantic technologies such as ontology matching (OM). Existing approaches either take too long to run or make strong assumptions about the running environment. To deal with this issue, we propose V-Doc+, a three-stage approach for matching large ontologies, based on the MapReduce framework and the virtual document technique. Specifically, two MapReduce processes are performed in the first stage to extract the textual descriptions of named entities (classes, properties, and instances) and blank nodes, respectively. In the second stage, the extracted descriptions are exchanged with neighbors in Resource Description Framework (RDF) graphs to construct virtual documents. This process also benefits from the MapReduce-based implementation. In the third stage, a word-weight-based partitioning method is proposed to conduct parallel similarity calculation using the term frequency-inverse document frequency (TF-IDF) model. Experimental results on two large-scale real datasets and the benchmark testbed from the Ontology Alignment Evaluation Initiative (OAEI) are reported, showing that the proposed approach significantly reduces the run time with minor loss in precision and recall.
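The similarity calculation in the third stage can be illustrated with a minimal single-machine sketch, assuming each virtual document has already been reduced to a list of tokens. The entity names, the token lists, and the function names `tf_idf_vectors` and `cosine` are hypothetical, and the particular TF-IDF weighting (relative term frequency times log inverse document frequency) is an assumption. The sketch does not reproduce the MapReduce stages or the word-weight-based partitioning described in the abstract.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return one {term: weight} dict per document, weight = tf * idf."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))

    vectors = {}
    for name, tokens in documents.items():
        counts = Counter(tokens)
        total = len(tokens)
        vectors[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical virtual documents: tokens gathered from an entity's own
# description and from its neighbours in the RDF graph.
virtual_docs = {
    "o1:Author": ["author", "writer", "paper", "person"],
    "o2:Writer": ["writer", "author", "person", "book"],
    "o2:Venue": ["conference", "journal", "venue"],
}

vectors = tf_idf_vectors(virtual_docs)
sim_related = cosine(vectors["o1:Author"], vectors["o2:Writer"])
sim_unrelated = cosine(vectors["o1:Author"], vectors["o2:Venue"])
print(sim_related > sim_unrelated)
```

Entities whose virtual documents share weighted terms score higher than entities with disjoint vocabularies; a matcher would typically keep the pairs whose similarity exceeds a chosen threshold.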
Funding: Supported by the National High-Tech Research and Development (863) Program of China (No. 2009AA01Z147), the National Natural Science Foundation of China (Nos. 61003156 and 90818027), and the National Key Basic Research and Development (973) Program of China (No. 2009CB320703)
Abstract: An element may have heterogeneous semantic interpretations in different ontologies. Understanding the real local meanings of elements is therefore very useful for ontology operations such as querying and reasoning, which are the foundations of many applications, including semantic search, ontology matching, and linked data analysis. However, because ontologies differ in how they describe their elements, obtaining the semantic context of an element remains an open problem. A semantic subgraph is proposed to capture the real meanings of ontology elements. To extract the semantic subgraphs, a hybrid ontology graph is used to represent the semantic relations between elements. An extraction algorithm based on an electrical circuit model is then applied with new conductivity calculation rules to improve the quality of the semantic subgraphs. The evaluation results show that the semantic subgraphs properly capture the local meanings of elements. Ontology matching based on semantic subgraphs also demonstrates that the semantic subgraph is a promising technique for ontology applications.
Abstract: Ontology alignment has been studied for over a decade, and over that time many alignment systems and methods have been developed by researchers in order to find simple 1-to-1 equivalence matches between two ontologies. However, very few alignment systems focus on finding complex correspondences. One reason for this limitation may be that there are no widely accepted alignment benchmarks that contain such complex relationships. In this paper, we propose a real-world data set from the GeoLink project as a potential complex ontology alignment benchmark. The data set consists of two ontologies, the GeoLink Base Ontology (GBO) and the GeoLink Modular Ontology (GMO), as well as a manually created reference alignment that was developed in consultation with domain experts from different institutions. The alignment includes 1:1, 1:n, and m:n equivalence and subsumption correspondences, and is available in both Expressive and Declarative Ontology Alignment Language (EDOAL) and rule syntax. The benchmark has been expanded from its original version to contain real-world instance data from seven geoscience data providers that has been published according to both ontologies. This allows it to be used by extensional alignment systems or those that require training data. This benchmark has been incorporated into the Ontology Alignment Evaluation Initiative (OAEI) complex track to help researchers test their automated alignment systems and algorithms. This paper also analyzes the challenges inherent in effectively generating, detecting, and evaluating complex ontology alignments and provides a road map for future work on this topic.
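The correspondence types named in this abstract (1:1, 1:n, and m:n, each either equivalence or subsumption) can be made concrete with a small sketch. The class `Correspondence`, its field names, and the entity names below are hypothetical; they are not taken from the GeoLink ontologies, the reference alignment, or EDOAL. Treating a single entity on either side as 1:n is also an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Correspondence:
    """One alignment entry relating source entities to target entities."""
    source: Tuple[str, ...]
    target: Tuple[str, ...]
    relation: str  # "equivalence" or "subsumption"

    def cardinality(self) -> str:
        """Classify the entry by the number of entities on each side.

        Assumes both sides are non-empty.
        """
        n_source, n_target = len(self.source), len(self.target)
        if n_source == 1 and n_target == 1:
            return "1:1"
        # Assumption: one entity on either side counts as 1:n.
        if n_source == 1 or n_target == 1:
            return "1:n"
        return "m:n"


# Hypothetical entries, for illustration only.
alignment = [
    Correspondence(("o1:A",), ("o2:A",), "equivalence"),
    Correspondence(("o1:B",), ("o2:B1", "o2:B2"), "subsumption"),
    Correspondence(("o1:C1", "o1:C2"), ("o2:C1", "o2:C2"), "equivalence"),
]

# Count how many entries of each (cardinality, relation) kind the alignment holds.
summary = Counter((c.cardinality(), c.relation) for c in alignment)
for (cardinality, relation), count in sorted(summary.items()):
    print(cardinality, relation, count)
```

A system that only produces 1:1 equivalence matches would cover just the first kind of entry, which is the gap a complex alignment benchmark is meant to expose.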
Funding: Supported by the National Natural Science Foundation of China (No. 61170026)
Abstract: Ontology occupies an important position in artificial intelligence, computational linguistics, and knowledge management. However, when different ontologies are constructed to represent the same information in a domain, the so-called heterogeneity problem arises. A key task in addressing this problem is to discover the semantic relationships between the entities of two given ontologies, which is called ontology alignment. Recently, meta-heuristic algorithms have come to be regarded as an effective approach to the ontology alignment problem. However, as ontologies become increasingly large, meta-heuristic algorithms are more likely to become trapped in a locally optimal alignment in the large search space. In addition, many existing approaches use population-based meta-heuristic algorithms, which require massive computation. In this paper, an improved compact particle swarm algorithm with a local search strategy, called LSCPSOA, is proposed to find more correct correspondences. In LSCPSOA, two update strategies with local search capability are employed to avoid falling into a locally optimal alignment. The proposed algorithm has been evaluated on several large ontology data sets and compared with existing ontology alignment methods. The experimental results show that the proposed algorithm finds more correct correspondences and improves the time performance compared with other meta-heuristic algorithms.