Abstract: Multidatabase systems are designed to achieve schema integration and data interoperation among distributed and heterogeneous database systems, but data model heterogeneity and schema heterogeneity make this a challenging task. A multidatabase common data model based on XML, named the XML-based Integration Data Model (XIDM), is first introduced; it is suitable for integrating different types of schemas. An approach to schema mapping based on XIDM in multidatabase systems is then presented. The mappings include global mappings, which handle horizontal and vertical partitioning between global schemas and export schemas, and local mappings, which handle the transformation between export schemas and local schemas. Finally, the illustration and implementation of schema mappings in a multidatabase prototype, the Panorama system, are discussed. The implementation results demonstrate that XIDM is an efficient model for managing multiple heterogeneous data sources and that the XIDM-based schema mapping approaches behave very well when integrating relational database systems, object-oriented database systems, and other file systems.
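As an illustration of what a global mapping has to do, the following minimal sketch rebuilds a global relation from horizontally and vertically partitioned export schemas. It is not the XIDM or Panorama implementation: the function names, the list-of-dicts row format, and the sample data are all assumptions made for the example.

```python
from typing import Dict, List

Row = Dict[str, object]


def merge_horizontal(*fragments: List[Row]) -> List[Row]:
    """Horizontal partitioning: every fragment has the same attributes but
    different rows, so the global relation is the union of the fragments."""
    merged: List[Row] = []
    for fragment in fragments:
        merged.extend(fragment)
    return merged


def merge_vertical(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Vertical partitioning: each fragment holds some attributes plus a
    shared key, so the global relation is a join on that key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Hypothetical export schemas: two sites split employees by row,
# a third site stores the salary attribute separately.
site_a = [{"id": 1, "name": "Ann"}]
site_b = [{"id": 2, "name": "Bob"}]
salaries = [{"id": 1, "salary": 100}, {"id": 2, "salary": 120}]

employees = merge_vertical(merge_horizontal(site_a, site_b), salaries, key="id")
```

In this sketch the horizontal fragments are merged first and the vertical fragment is joined afterwards; a real system would also need to reconcile attribute names and types across export schemas.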
Abstract: Most semi-structured data have a certain structural regularity. Once stored as structured data in a relational database (RDB), they can be effectively managed by a database management system (DBMS). Some semi-structured data are difficult to transform because of their irregular structures. We design an efficient algorithm and data structure to ensure lossless transformation. We propose an approach to schema extraction through data mining, in which different kinds of elements are transformed separately, so that a lossless mapping from semi-structured data to structured data can be achieved.
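One way to picture a lossless mapping of this kind is sketched below. It is an assumed simplification, not the authors' algorithm: attributes that occur frequently become columns of a main table, irregular attributes go to an overflow table of (object id, attribute, value) triples, and the original records can be rebuilt from both tables. The names `extract_schema`, `to_relational`, `from_relational`, and the `min_support` threshold are hypothetical, and the sketch assumes that no source value is `None`.

```python
from collections import Counter


def extract_schema(records, min_support=0.5):
    """Pick attributes that appear in at least min_support of the records."""
    counts = Counter(attr for rec in records for attr in rec)
    threshold = min_support * len(records)
    return sorted(attr for attr, count in counts.items() if count >= threshold)


def to_relational(records, columns):
    """Regular attributes go to the main table, the rest to an overflow table."""
    main, overflow = [], []
    for oid, rec in enumerate(records):
        main.append((oid,) + tuple(rec.get(col) for col in columns))
        overflow.extend((oid, a, v) for a, v in rec.items() if a not in columns)
    return main, overflow


def from_relational(main, overflow, columns):
    """Inverse mapping: rebuild the original records from both tables."""
    records = {
        row[0]: {c: v for c, v in zip(columns, row[1:]) if v is not None}
        for row in main
    }
    for oid, attr, value in overflow:
        records[oid][attr] = value
    return [records[oid] for oid in sorted(records)]


# Hypothetical semi-structured input with one irregular attribute.
books = [
    {"title": "A", "year": 2001},
    {"title": "B", "year": 2002, "isbn": "x"},
    {"title": "C"},
]
columns = extract_schema(books)
main, overflow = to_relational(books, columns)
restored = from_relational(main, overflow, columns)
```

The round trip through `to_relational` and `from_relational` returns the input records, which is the sense in which this toy mapping is lossless.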
Abstract: In this paper, we first introduce the architecture of a CORBA-based multidatabase system and then give an approach to schema mapping between XML and relational database systems. Finally, we investigate the integration of XML with CORBA-based multidatabase systems. This integration extends the ability of a CORBA-based multidatabase system to implement data sharing and interoperability. Key words: heterogeneity; XML; schema mapping; multidatabase systems. CLC number: TP311.13. Foundation item: Supported by the National Key Technologies R&D Program of China (2002BA103A04). Biography: LI Rui-xuan (1974-), male, Ph.D. candidate, research interests: heterogeneous information integration.
Abstract: Semantic conflict arises when heterogeneous systems express the same real-world entity in different ways. It prevents information integration from achieving semantic coherence. Because ontology helps to solve semantic problems, it has become a hot topic in information integration. In this paper, we introduce semantic conflict into the information integration of heterogeneous applications. We discuss the origins and categories of the conflict and present an ontology-based schema mapping approach to eliminate semantic conflicts. Key words: ontology; CCSOL; semantic conflict; schema mapping. CLC number: TP 301. Biography: LU Han (1980-), male, Master candidate, research direction: ontology and information integration.
Funding: This work is supported by the National Natural Science Foundation of China under Grant Nos. 60503038, 60473069, 60496325 and 60573092. The authors would like to thank Peter Mork for his comments on the extended rule system, and also thank the anonymous referees for their invaluable comments.
Abstract: Update management is very important for data integration systems, so update management in peer data management systems (PDMSs) is a hot research area. This paper studies view maintenance in PDMSs. First, the definition of a view is extended, and the peer view, local view, and global view are proposed according to the requirements of applications. Two main factors influence materialized views in PDMSs: changes to the schema mappings between peers, and updates that peers make to their data. Based on these requirements, this paper proposes an algorithm called 2DCMA, which includes two sub-algorithms, a data consistency maintenance algorithm and a definition consistency maintenance algorithm, to maintain views effectively. For data consistency maintenance, Mork's rules are extended to govern the use of updategrams and boosters. The new rule system can be used to optimize the execution plan, and the data consistency maintenance algorithm is based on it. Furthermore, an ECA rule is adopted for definition consistency maintenance. Finally, extensive simulation experiments are conducted in SPDMS. The simulation results show that 2DCMA performs better than Mork's approach when maintaining data consistency, and better than a centralized view maintenance algorithm when maintaining definition consistency.
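The general idea of maintaining a materialized view from an updategram, rather than recomputing it, can be sketched as follows. This is not 2DCMA or Mork's rule system, and it leaves out boosters entirely: the `Updategram` class, the `apply_updategram` function, and the selection-only view are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class Updategram:
    """Assumed representation: the tuples inserted into and deleted from
    a base relation since the view was last refreshed."""
    inserted: set = field(default_factory=set)
    deleted: set = field(default_factory=set)


def apply_updategram(view, delta, predicate):
    """Incrementally refresh a selection view: drop deleted tuples and add
    only those inserted tuples that satisfy the view predicate."""
    return (view - delta.deleted) | {t for t in delta.inserted if predicate(t)}


def expensive(item):
    """Hypothetical view predicate: items priced at 50 or more."""
    return item[1] >= 50


# Hypothetical base relation of (item, price) tuples and its view.
base = {("p1", 10), ("p2", 50), ("p3", 70)}
view = {t for t in base if expensive(t)}

delta = Updategram(inserted={("p4", 90), ("p5", 5)}, deleted={("p2", 50)})
new_view = apply_updategram(view, delta, expensive)

# Recomputing from the updated base relation gives the same result.
new_base = (base - delta.deleted) | delta.inserted
recomputed = {t for t in new_base if expensive(t)}
```

The incremental result touches only the tuples named in the updategram, which is why propagating updategrams between peers can be cheaper than shipping and re-evaluating whole relations.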