Abstract: This book chapter is an extended version of the research paper entitled “Use of Component Integration Services in Multidatabase Systems”, which was presented at and published by the 13th ISITA, the National Conference of Recent Trends in Mathematical and Computer Sciences, T.M.B. University, Bhagalpur, India, January 3-4, 2015. Information is widely distributed across many remote, autonomous databases (local component databases) in heterogeneous formats. Integrating heterogeneous remote databases is a difficult task that several projects have already addressed to some extent. In this chapter, we discuss how to integrate heterogeneous distributed local relational databases; we focus on relational databases because of their simplicity, excellent security, performance, power, flexibility, data independence, support for new hardware technologies, and worldwide adoption. We also discuss how to constitute a global conceptual schema in the multidatabase system using Sybase Adaptive Server Enterprise’s Component Integration Services (CIS) and OmniConnect. This approach is feasible for higher education institutions as well as commercial industries. For higher education institutions, CIS can improve IT integration with their subsidiaries or with other institutions, within the country and abroad, in educational management, teaching, learning, and research, including the promotion of international students’ academic integration, collaboration, and governance. This is an innovative strategy to support the modernization and large-scale expansion of academic institutions, and it can be regarded as IT-institutional alignment within a higher education context. It also supports one of the sustainable development goals set by the United Nations: “Goal 4: ensure inclusive and quality education for all and promote lifelong learning”.
However, the process of IT integration in higher education institutions must be thoroughly evaluated, and the vital data access points identified. Section 1 provides an introduction, including the evolution of database systems and data models and the emergence and importance of multidatabase systems. Section 2 discusses Component Integration Services (CIS) and OmniConnect and considers heterogeneous distributed local relational databases from an academic perspective. Section 3 discusses Sybase Adaptive Server Enterprise (ASE). Section 4 discusses the role of CIS and OmniConnect of Sybase ASE in the multidatabase system. Section 5 presents the database architectural framework. Section 6 provides an implementation overview of the global conceptual schema in the multidatabase system. Section 7 discusses query processing in CIS, and Section 8 concludes the chapter. The chapter should be useful to students because it covers the evolution of databases and data models and the emergence of multidatabases. The source of each cited item is given in the references.
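The idea of a global conceptual schema over autonomous local relational databases can be illustrated with a minimal sketch. It does not use Sybase CIS or OmniConnect syntax; it assumes SQLite (through Python's standard `sqlite3` module) as a stand-in, and the database names `campus_a` and `campus_b`, the tables, the columns, and the sample rows are all hypothetical. Two local databases with differently named columns are unified behind one view that plays the role of the global schema.

```python
import sqlite3


def build_global_schema() -> sqlite3.Connection:
    """Create two hypothetical local databases and one unifying global view."""
    conn = sqlite3.connect(":memory:")

    # Each ATTACH creates a separate in-memory database standing in for an
    # autonomous local component database (names are assumptions).
    conn.execute("ATTACH DATABASE ':memory:' AS campus_a")
    conn.execute("ATTACH DATABASE ':memory:' AS campus_b")

    # The local schemas differ in table and column names (schema heterogeneity).
    conn.execute("CREATE TABLE campus_a.student (sid INTEGER, full_name TEXT)")
    conn.execute("CREATE TABLE campus_b.learner (learner_no INTEGER, name TEXT)")

    conn.executemany(
        "INSERT INTO campus_a.student VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")]
    )
    conn.executemany("INSERT INTO campus_b.learner VALUES (?, ?)", [(7, "Mei")])

    # A TEMP view may reference tables in attached databases; it maps both
    # local schemas onto one set of global column names.
    conn.execute(
        """
        CREATE TEMP VIEW global_student AS
        SELECT 'campus_a' AS source, sid AS student_id, full_name AS name
        FROM campus_a.student
        UNION ALL
        SELECT 'campus_b' AS source, learner_no AS student_id, name
        FROM campus_b.learner
        """
    )
    return conn


conn = build_global_schema()
# A global query is written once against the unified view.
rows = conn.execute(
    "SELECT source, student_id, name FROM global_student "
    "ORDER BY source, student_id"
).fetchall()
print(rows)
```

The view hides where each row lives, which is the property the global conceptual schema is meant to provide; a real multidatabase system would additionally handle remote connections, data type mapping, and query decomposition.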
Funding: This work is supported by the National Natural Science Foundation of China under Grant No. 60573126, the National High-Tech Research and Development 863 Program of China under Grant No. 2004AA112010, and the National Grand Fundamental Research 973 Program of China under Grant No. 2002CB312005.
Abstract: An important task in database integration is to resolve data conflicts at both the schema level and the semantic level. The latter is especially difficult. Some existing ontology-based approaches have been criticized for their lack of domain generality and semantic richness. To overcome these limitations, this paper introduces a systematic approach for detecting and resolving various semantic conflicts in heterogeneous databases. The approach has two parts: a semantic conflict representation model based on our classification framework of semantic conflicts, and a methodology for detecting and resolving semantic conflicts based on this model. A system has been developed, and experimental evaluations on it indicate that the approach resolves many semantic conflicts effectively while remaining independent of domains and integration patterns.
Abstract: This study introduces the construction of an integrated database that includes casting shapes with their casting designs, technical knowledge, and thermophysical properties of casting alloys. A recognition technique for casting design based on industrial computed tomography was used to construct the shape database. Technical knowledge of casting processes, such as ferrous and non-ferrous alloys and the manufacturing processes of their castings, was accumulated, and a search engine for this knowledge was developed. The database of thermophysical properties of the casting alloys was obtained through experimental study, and the properties were used for in-house computer simulation of the casting process. The databases were linked with an intelligent casting expert system developed at the Center for e-Design, KITECH. The databases are expected to help non-experts in casting to devise a casting and its process. Various examples of applications using the databases are presented.
Funding: Supported by a grant from the National Natural Science Foundation of China (Grant No. 61373057) and a grant from the Zhejiang Provincial Natural Science Foundation of China (Grant No. Y1110763).
Abstract: Objective: Identifying colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. To mine CRC metastasis-associated genes, an integrated analysis of microarray data is presented, combined with evidence from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected by CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes located in the corresponding chromosomal aberration regions were selected and analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the smallest number of genes (14) and the highest classification accuracy (100%) in both the PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrate that integrated analysis is an effective strategy for mining cancer-associated genes.
Abstract: Current schema integration frameworks rely on approaches such as rule-based matching or machine learning. This paper presents an ontology-based wrapper-mediator framework that uses rule-based and machine learning strategies at the same time. The proposed framework uses global and local ontologies to resolve syntactic and semantic heterogeneity, and XML for interoperability. The concepts in the candidate schemas are merged on the basis of a similarity coefficient, which is calculated using the defined rules and the prior mappings stored in the case-base.
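How a similarity coefficient might combine defined rules with prior mappings from a case-base can be sketched as follows. This is only an illustration under stated assumptions, not the framework's actual formula: the string-similarity rule (Python's `difflib.SequenceMatcher`), the equal weighting, the merge threshold, and the sample case-base entry are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical case-base: concept pairs confirmed in earlier integrations,
# each stored with the similarity score that was accepted for it.
CASE_BASE = {frozenset({"emp_name", "staff_name"}): 1.0}


def rule_score(a: str, b: str) -> float:
    """Rule-based similarity: exact match first, else string similarity."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def similarity(a: str, b: str, case_base=CASE_BASE, weight: float = 0.5) -> float:
    """Blend the rule score with a prior mapping when the case-base has one.

    The weight of 0.5 is an assumed value chosen for illustration.
    """
    rule = rule_score(a, b)
    prior = case_base.get(frozenset({a.lower(), b.lower()}))
    if prior is None:
        return rule  # no prior knowledge: fall back to the rules alone
    return weight * rule + (1.0 - weight) * prior


def should_merge(a: str, b: str, threshold: float = 0.75) -> bool:
    """Merge two concepts when their coefficient reaches the assumed threshold."""
    return similarity(a, b) >= threshold


print(should_merge("emp_name", "staff_name"))  # prior mapping lifts the score
print(should_merge("salary", "phone"))         # unrelated names stay separate
```

The prior mapping raises the coefficient for a pair whose names alone look only moderately similar, which is the benefit of keeping earlier decisions in a case-base.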
Abstract: Fundamentally, a semantic grid database brings globally distributed databases together to coordinate resource sharing and problem solving, with information given well-defined meaning. DartGrid II is an implemented database grid system whose goal is to provide a semantic solution for integrating database resources on the Web. Although many algorithms have been proposed for optimizing query processing to minimize the cost and/or response time of answering a query in a distributed database system, the database grid query optimization problem is fundamentally different from traditional distributed query optimization. These differences are shown to be consequences of the autonomy and heterogeneity of database nodes in a database grid. Query optimization in a database grid therefore poses more challenges than in a traditional distributed database. Following this observation, the design of a query optimizer in DartGrid II is presented, and a heuristic, dynamic, and parallel query optimization approach for processing queries in a database grid is proposed. A set of semantic tools supporting relational database integration and semantic-based information browsing has also been implemented to realize this vision.
Funding: Supported by the Spanish Ministry of Innovation and Science through REALIDAD: Gestion, Analisis y Explotacion Eficiente de Datos Vinculados under Grant No. TIN2011-25840.
Abstract: The problem of matching schemas or ontologies consists of finding corresponding entities in two or more knowledge models that belong to the same domain but have been developed separately. Many techniques and tools now address this problem; however, the complex nature of the matching problem makes existing solutions not fully satisfactory in real situations. The Google Similarity Distance has appeared recently. Its purpose is to mine knowledge from the Web using the Google search engine in order to semantically compare text expressions. Our work consists of developing a software application that validates results discovered by schema and ontology matching tools using the philosophy behind this distance. Moreover, we are interested in using not only Google but also other popular search engines with this similarity distance. The results reveal three main facts. First, some web search engines can help us validate semantic correspondences satisfactorily. Second, there are significant differences among the web search engines. Third, the best results are obtained when using combinations of the web search engines that we have studied.
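The distance this abstract builds on can be written out as a short sketch. It uses the published definition of the Normalized Google Distance, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f counts the pages containing a term and N is the number of indexed pages. The function name, the handling of zero counts, and the hit counts in the usage lines are assumptions made for illustration; no search engine is actually queried.

```python
import math


def normalized_search_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Normalized Google Distance computed from page-hit counts.

    fx, fy: hits for each term alone; fxy: hits for both terms together;
    n: assumed total number of pages indexed by the search engine.
    """
    if n <= max(fx, fy):
        raise ValueError("n must exceed the individual hit counts")
    if min(fx, fy, fxy) <= 0:
        # A term that never occurs (or never co-occurs) is treated here as
        # infinitely distant; this convention is an assumption of the sketch.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


# Hypothetical hit counts: terms that always co-occur have distance 0,
# and rarer co-occurrence gives a larger distance.
always_together = normalized_search_distance(1000, 1000, 1000, 10**6)
rarely_together = normalized_search_distance(1000, 1000, 10, 10**6)
print(always_together, rarely_together)
```

Because the function takes only hit counts, the same code applies to any search engine that reports them, which is what makes comparing or combining several engines possible.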