Current schema integration frameworks rely on approaches such as rule-based matching or machine learning. This paper presents an ontology-based wrapper-mediator framework that uses rule-based and machine learning strategies together. The proposed framework uses global and local ontologies to resolve syntactic and semantic heterogeneity, and XML for interoperability. Concepts in the candidate schemas are merged on the basis of a similarity coefficient, which is calculated from the defined rules and the prior mappings stored in the case base.
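As a rough illustration of how a similarity coefficient might combine defined rules with prior mappings from a case base, here is a minimal Python sketch. The abstract does not give the actual rules, weights, or threshold, so everything here is an assumption: the single name-similarity rule, the `CASE_BASE` contents, the equal weighting `w_rule=0.5`, and the merge threshold of 0.7 are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical case base: concept pairs confirmed as matches in earlier integrations.
CASE_BASE = {
    ("emp_name", "employee_name"): 1.0,
    ("dept", "department"): 1.0,
}


def rule_score(a: str, b: str) -> float:
    """Rule-based part (assumed): normalised string similarity of concept names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def case_score(a: str, b: str) -> float:
    """Case-based part (assumed): look up a stored prior mapping in either order."""
    key = (a.lower(), b.lower())
    return CASE_BASE.get(key, CASE_BASE.get((key[1], key[0]), 0.0))


def similarity(a: str, b: str, w_rule: float = 0.5) -> float:
    """Weighted combination of the rule-based and case-based scores."""
    return w_rule * rule_score(a, b) + (1.0 - w_rule) * case_score(a, b)


def should_merge(a: str, b: str, threshold: float = 0.7) -> bool:
    """Merge two concepts when their similarity coefficient reaches the threshold."""
    return similarity(a, b) >= threshold
```

In this sketch a pair with a stored prior mapping, such as `dept` and `department`, clears the threshold even though the names alone are only moderately similar, while an unseen pair has to rely on the rule score.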
ETL (Extract-Transform-Load) usually comprises three phases: extraction, transformation, and loading. In building a data warehouse, it plays the role of data injection and is the most time-consuming activity, so improving ETL performance is necessary. This paper proposes a new ETL approach, TEL (Transform-Extract-Load). TEL uses virtual tables to perform the transformation stage before the extraction and loading stages, without a data staging area or staging database to store the raw data extracted from each of the disparate source systems. TEL reduces the data transmission load and improves the performance of queries from the access layers. Experimental results based on our proposed benchmarks show that the TEL approach is feasible and practical.
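The transform-before-extract idea can be sketched with an ordinary SQL view standing in for a virtual table. This is only an illustration under stated assumptions: the abstract does not say how the virtual tables are implemented, and the table names (`orders_raw`, `orders_clean`, `orders`), the columns, and the use of in-memory SQLite databases are hypothetical.

```python
import sqlite3


def build_source(conn: sqlite3.Connection) -> None:
    """Create a hypothetical source table and a view that transforms it in place."""
    conn.execute(
        "CREATE TABLE orders_raw (id INTEGER, amount_cents INTEGER, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders_raw VALUES (?, ?, ?)",
        [(1, 1250, "us"), (2, 990, "de"), (3, 400, "us")],
    )
    # Transform first: the view is a virtual table, so no raw copy is staged.
    conn.execute(
        """
        CREATE VIEW orders_clean AS
        SELECT id,
               amount_cents / 100.0 AS amount,
               UPPER(country) AS country
        FROM orders_raw
        """
    )


def extract_and_load(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Extract already-transformed rows from the view and load them into the warehouse."""
    warehouse.execute("CREATE TABLE orders (id INTEGER, amount REAL, country TEXT)")
    rows = source.execute("SELECT id, amount, country FROM orders_clean").fetchall()
    warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")
build_source(source)
loaded = extract_and_load(source, warehouse)
```

Only the transformed rows cross the connection between the two databases, which is the sense in which skipping a staging area can reduce the amount of data transmitted.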