Data transformation is the core process in migrating database from relational database to NoSQL database such as column-oriented database. However,there is no standard guideline for data transformation from relationa...Data transformation is the core process in migrating database from relational database to NoSQL database such as column-oriented database. However,there is no standard guideline for data transformation from relational database toNoSQL database. A number of schema transformation techniques have been proposed to improve data transformation process and resulted better query processingtime when compared to the relational database query processing time. However,these approaches produced redundant tables in the resulted schema that in turnconsume large unnecessary storage size and produce high query processing timedue to the generated schema with redundant column families in the transformedcolumn-oriented database. In this paper, an efficient data transformation techniquefrom relational database to column-oriented database is proposed. The proposedschema transformation technique is based on the combination of denormalizationapproach, data access pattern and multiple-nested schema. In order to validate theproposed work, the proposed technique is implemented by transforming data fromMySQL database to MongoDB database. A benchmark transformation techniqueis also performed in which the query processing time and the storage size arecompared. Based on the experimental results, the proposed transformation technique showed significant improvement in terms query processing time and storagespace usage due to the reduced number of column families in the column-orienteddatabase.展开更多
The book chapter is an extended version of the research paper entitled “Use of Component Integration Services in Multidatabase Systems”, which is presented and published by the 13<sup>th</sup> ISITA, the...The book chapter is an extended version of the research paper entitled “Use of Component Integration Services in Multidatabase Systems”, which is presented and published by the 13<sup>th</sup> ISITA, the National Conference of Recent Trends in Mathematical and Computer Sciences, T.M.B. University, Bhagalpur, India, January 3-4, 2015. Information is widely distributed across many remote, distributed, and autonomous databases (local component databases) in heterogeneous formats. The integration of heterogeneous remote databases is a difficult task, and it has already been addressed by several projects to certain extents. In this chapter, we have discussed how to integrate heterogeneous distributed local relational databases because of their simplicity, excellent security, performance, power, flexibility, data independence, support for new hardware technologies, and spread across the globe. We have also discussed how to constitute a global conceptual schema in the multidatabase system using Sybase Adaptive Server Enterprise’s Component Integration Services (CIS) and OmniConnect. This is feasible for higher education institutions and commercial industries as well. Considering the higher educational institutions, the CIS will improve IT integration for educational institutions with their subsidiaries or with other institutions within the country and abroad in terms of educational management, teaching, learning, and research, including promoting international students’ academic integration, collaboration, and governance. This will prove an innovative strategy to support the modernization and large expansion of academic institutions. This will be considered IT-institutional alignment within a higher education context. This will also support achieving one of the sustainable development goals set by the United Nations: “Goal 4: ensure inclusive and quality education for all and promote lifelong learning”. However, the process of IT integration into higher educational institutions must be thoroughly evaluated, identifying the vital data access points. In this chapter, Section 1 provides an introduction, including the evolution of various database systems, data models, and the emergence of multidatabase systems and their importance. Section 2 discusses component integration services (CIS), OmniConnect and considering heterogeneous relational distributed local databases from the perspective of academics, Section 3 discusses the Sybase Adaptive Server Enterprise (ASE), Section 4 discusses the role of component integration services and OmniConnect of Sybase ASE under the Multidatabase System, Section 5 shows the database architectural framework, Section 6 provides an implementation overview of the global conceptual schema in the multidatabase system, Section 7 discusses query processing in the CIS, and finally, Section 8 concludes the chapter. The chapter will help our students a lot, as we have discussed well the evolution of databases and data models and the emergence of multidatabases. Since some additional useful information is cited, the source of information for each citation is properly mentioned in the references column.展开更多
With widespread use of relational database in various real-life applications,maintaining integrity and providing copyright protection is gaining keen interest of the researchers.For this purpose,watermarking has been ...With widespread use of relational database in various real-life applications,maintaining integrity and providing copyright protection is gaining keen interest of the researchers.For this purpose,watermarking has been used for quite a long time.Watermarking requires the role of trusted third party and a mechanism to extract digital signatures(watermark)to prove the ownership of the data under dispute.This is often inefficient as lots of processing is required.Moreover,certain malicious attacks,like additive attacks,can give rise to a situation when more than one parties can claim the ownership of the same data by inserting and detecting their own set of watermarks from the same data.To solve this problem,we propose to use blockchain technology—as trusted third party—along with watermarking for providing a means of rights protection of relational databases.Using blockchain for writing the copyright information alongside watermarking helps to secure the watermark as changing the blockchain is very difficult.This way,we combined the resilience of our watermarking scheme and the strength of blockchain technology—for protecting the digital rights information from alteration—to design and implement a robust scheme for digital right protection of relational databases.Moreover,we also discuss how the proposed scheme can also be used for version control.The proposed technique works with nonnumeric features of relational database and does not target only selected tuple or portion(subset)from the database for watermark embedding unlike most of the existing techniques;as a result,the chances of subset selection containing no watermark decrease automatically.The proposed technique employs zerowatermarking approach and hence no intentional error(watermark)is added to the original dataset.The results of the experiments proved the effectiveness of the proposed scheme.展开更多
Data Mining, also known as knowledge discovery in data (KDC), is the process of uncovering patterns and other valuable information from large data sets. According to https://www.geeksforgeeks.org/data-mining/, it can ...Data Mining, also known as knowledge discovery in data (KDC), is the process of uncovering patterns and other valuable information from large data sets. According to https://www.geeksforgeeks.org/data-mining/, it can be referred to as knowledge mining from data, knowledge extraction, data/pattern analysis, data archaeology, and data dredging. With advance research in health sector, there is multitude of Data available in healthcare sector. The general problem then becomes how to use the existing information in a more useful targeted way. Data Mining therefore is the best available technique. The objective of this paper is to review and analyse some of the different Data Mining Techniques such as Application, Classification, Clustering, Regression, etc. applied in the Domain of Healthcare.展开更多
In traditional database applications, queries intend to retrieve data satisfying precise conditions. As a result, thousands of data can be retrieved (overabundant answer) or, even worse, no data at all (empty answer)....In traditional database applications, queries intend to retrieve data satisfying precise conditions. As a result, thousands of data can be retrieved (overabundant answer) or, even worse, no data at all (empty answer). In both cases, the queries must be reformulated to produce more significant results and, typically, many related queries are submitted by a user before he can be finally satisfied. To overcome these problems, this paper proposes a unified solution in the framework of flexible queries with fuzzy semantics. This solution, based on the concept of semantic proximity and implemented in a tool for flexible query answering, allows the automatic reformulation of queries with empty or overabundant answers.展开更多
Following the completion of genome sequencing of model plants,such as rice(Oryza sativa L.)and Arabidopsis thaliana,the era of functional plant genomics has arrived which provides a solid basis for the develop-ment of...Following the completion of genome sequencing of model plants,such as rice(Oryza sativa L.)and Arabidopsis thaliana,the era of functional plant genomics has arrived which provides a solid basis for the develop-ment of plant proteomics.We review the background and concepts of proteomics,as well as the key techniques which include:(1)separation techniques such as 2-DE(two-dimensional electrophoresis),RP-HPLC(reverse phase high performance liquid chromatography)and SELDI(surface enhanced laser desorption/ionization)protein chip;(2)mass spectrometry such as MALDI-TOF-MS(matrix assisted laser desorption/ionization-time of flight-mass spectrometry)and ESI-MS/MS(electrospray ionization mass spectrometry/mass spectrometry);(3)Peptide sequence tags;(4)databases related to proteomics;(5)quantitative proteome;(6)TAP(tandem affinity purification)and(7)yeast two-hybrid system.In addition,the challenges and prospects of pro-teomics are also discussed.展开更多
基金supported by Universiti Putra Malaysia Grant Scheme(Putra Grant)(GP/2020/9692500).
文摘Data transformation is the core process in migrating database from relational database to NoSQL database such as column-oriented database. However,there is no standard guideline for data transformation from relational database toNoSQL database. A number of schema transformation techniques have been proposed to improve data transformation process and resulted better query processingtime when compared to the relational database query processing time. However,these approaches produced redundant tables in the resulted schema that in turnconsume large unnecessary storage size and produce high query processing timedue to the generated schema with redundant column families in the transformedcolumn-oriented database. In this paper, an efficient data transformation techniquefrom relational database to column-oriented database is proposed. The proposedschema transformation technique is based on the combination of denormalizationapproach, data access pattern and multiple-nested schema. In order to validate theproposed work, the proposed technique is implemented by transforming data fromMySQL database to MongoDB database. A benchmark transformation techniqueis also performed in which the query processing time and the storage size arecompared. Based on the experimental results, the proposed transformation technique showed significant improvement in terms query processing time and storagespace usage due to the reduced number of column families in the column-orienteddatabase.
文摘The book chapter is an extended version of the research paper entitled “Use of Component Integration Services in Multidatabase Systems”, which is presented and published by the 13<sup>th</sup> ISITA, the National Conference of Recent Trends in Mathematical and Computer Sciences, T.M.B. University, Bhagalpur, India, January 3-4, 2015. Information is widely distributed across many remote, distributed, and autonomous databases (local component databases) in heterogeneous formats. The integration of heterogeneous remote databases is a difficult task, and it has already been addressed by several projects to certain extents. In this chapter, we have discussed how to integrate heterogeneous distributed local relational databases because of their simplicity, excellent security, performance, power, flexibility, data independence, support for new hardware technologies, and spread across the globe. We have also discussed how to constitute a global conceptual schema in the multidatabase system using Sybase Adaptive Server Enterprise’s Component Integration Services (CIS) and OmniConnect. This is feasible for higher education institutions and commercial industries as well. Considering the higher educational institutions, the CIS will improve IT integration for educational institutions with their subsidiaries or with other institutions within the country and abroad in terms of educational management, teaching, learning, and research, including promoting international students’ academic integration, collaboration, and governance. This will prove an innovative strategy to support the modernization and large expansion of academic institutions. This will be considered IT-institutional alignment within a higher education context. This will also support achieving one of the sustainable development goals set by the United Nations: “Goal 4: ensure inclusive and quality education for all and promote lifelong learning”. However, the process of IT integration into higher educational institutions must be thoroughly evaluated, identifying the vital data access points. In this chapter, Section 1 provides an introduction, including the evolution of various database systems, data models, and the emergence of multidatabase systems and their importance. Section 2 discusses component integration services (CIS), OmniConnect and considering heterogeneous relational distributed local databases from the perspective of academics, Section 3 discusses the Sybase Adaptive Server Enterprise (ASE), Section 4 discusses the role of component integration services and OmniConnect of Sybase ASE under the Multidatabase System, Section 5 shows the database architectural framework, Section 6 provides an implementation overview of the global conceptual schema in the multidatabase system, Section 7 discusses query processing in the CIS, and finally, Section 8 concludes the chapter. The chapter will help our students a lot, as we have discussed well the evolution of databases and data models and the emergence of multidatabases. Since some additional useful information is cited, the source of information for each citation is properly mentioned in the references column.
基金This project was supported by University of Jeddah under the Grant Number(UJ-02-014-ICGR).
文摘With widespread use of relational database in various real-life applications,maintaining integrity and providing copyright protection is gaining keen interest of the researchers.For this purpose,watermarking has been used for quite a long time.Watermarking requires the role of trusted third party and a mechanism to extract digital signatures(watermark)to prove the ownership of the data under dispute.This is often inefficient as lots of processing is required.Moreover,certain malicious attacks,like additive attacks,can give rise to a situation when more than one parties can claim the ownership of the same data by inserting and detecting their own set of watermarks from the same data.To solve this problem,we propose to use blockchain technology—as trusted third party—along with watermarking for providing a means of rights protection of relational databases.Using blockchain for writing the copyright information alongside watermarking helps to secure the watermark as changing the blockchain is very difficult.This way,we combined the resilience of our watermarking scheme and the strength of blockchain technology—for protecting the digital rights information from alteration—to design and implement a robust scheme for digital right protection of relational databases.Moreover,we also discuss how the proposed scheme can also be used for version control.The proposed technique works with nonnumeric features of relational database and does not target only selected tuple or portion(subset)from the database for watermark embedding unlike most of the existing techniques;as a result,the chances of subset selection containing no watermark decrease automatically.The proposed technique employs zerowatermarking approach and hence no intentional error(watermark)is added to the original dataset.The results of the experiments proved the effectiveness of the proposed scheme.
文摘Data Mining, also known as knowledge discovery in data (KDC), is the process of uncovering patterns and other valuable information from large data sets. According to https://www.geeksforgeeks.org/data-mining/, it can be referred to as knowledge mining from data, knowledge extraction, data/pattern analysis, data archaeology, and data dredging. With advance research in health sector, there is multitude of Data available in healthcare sector. The general problem then becomes how to use the existing information in a more useful targeted way. Data Mining therefore is the best available technique. The objective of this paper is to review and analyse some of the different Data Mining Techniques such as Application, Classification, Clustering, Regression, etc. applied in the Domain of Healthcare.
基金supported by CNPq(Brazilian National Counsel of Technological and Scientific Development),under grant numbers 305484/2012-5 and 104200/2013-8.
文摘In traditional database applications, queries intend to retrieve data satisfying precise conditions. As a result, thousands of data can be retrieved (overabundant answer) or, even worse, no data at all (empty answer). In both cases, the queries must be reformulated to produce more significant results and, typically, many related queries are submitted by a user before he can be finally satisfied. To overcome these problems, this paper proposes a unified solution in the framework of flexible queries with fuzzy semantics. This solution, based on the concept of semantic proximity and implemented in a tool for flexible query answering, allows the automatic reformulation of queries with empty or overabundant answers.
文摘Following the completion of genome sequencing of model plants,such as rice(Oryza sativa L.)and Arabidopsis thaliana,the era of functional plant genomics has arrived which provides a solid basis for the develop-ment of plant proteomics.We review the background and concepts of proteomics,as well as the key techniques which include:(1)separation techniques such as 2-DE(two-dimensional electrophoresis),RP-HPLC(reverse phase high performance liquid chromatography)and SELDI(surface enhanced laser desorption/ionization)protein chip;(2)mass spectrometry such as MALDI-TOF-MS(matrix assisted laser desorption/ionization-time of flight-mass spectrometry)and ESI-MS/MS(electrospray ionization mass spectrometry/mass spectrometry);(3)Peptide sequence tags;(4)databases related to proteomics;(5)quantitative proteome;(6)TAP(tandem affinity purification)and(7)yeast two-hybrid system.In addition,the challenges and prospects of pro-teomics are also discussed.