For a transaction processing system to operate effectively and efficiently in cloud environments, it is important to distribute huge amounts of data while guaranteeing the ACID (atomic, consistent, isolated, and durable) properties. Moreover, database partition and migration tools can help transplant conventional relational database systems to the cloud environment rather than rebuilding a new system. This paper proposes a database distribution management (DBDM) system, which partitions or replicates the data according to the transaction behaviors of the application system. The principal strategy of DBDM is to keep together the data used in a single transaction, thus avoiding massive transmission of records in join operations. The proposed system has been implemented successfully. Preliminary experiments show that the DBDM performs database partition and migration effectively. Also, the DBDM system is modularly designed to adapt to different database management systems (DBMSs) or different partition algorithms.
In order to design a new kind of mobile database management system (DBMS) more suitable for mobile computing than existing DBMSs, the essence of database systems in mobile computing is analyzed. The view is introduced that the mobile database is a kind of dynamic distributed database, and the concept of virtual servers, which translate the clients' mobility into the servers' mobility, is proposed. Based on these ideas, a versatile architecture for a mobile DBMS is presented. The architecture is composed of a virtual server and a local DBMS; the virtual server is the kernel of the architecture, and its functions are described. Finally, the server kernel of a mobile DBMS prototype is illustrated.
To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logic design was achieved in multi-dimensional database mode. The optimal data model was confirmed by identifying data objects, defining relations, and reviewing entities. The conversion of relations among entities to foreign keys, and of entities and physical attributes to tables and fields, is interpreted completely. On this basis, a multi-dimensional database that reflects the management and analysis of monitoring data in a dam safety monitoring system has been established, for which fact tables and dimension tables have been designed. Finally, based on service design and user interface design, the dam safety monitoring system has been developed with Delphi as the development tool. This development project shows that the multi-dimensional database can simplify the development process and minimize hidden dangers in the database structure design. It is superior to other dam safety monitoring system development models and can provide a new research direction for system developers.
Aim To develop a heterogeneous database united system (HDBUS) that combines the local databases of Oracle, Sybase, and SQL Server distributed on different servers into a global database, and supports global transaction management and parallel query over the Intranet. Methods In the design and implementation of HDBUS, two important concepts are introduced, including the heterogeneous table join. Results and Conclusion The first concept can be used to process parallel queries over multiple database servers; the second is the key technology of heterogeneous distributed databases.
A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transactions in mobile broadcast environments. At the mobile hosts, all transactions perform local pre-validation. The local pre-validation process is carried out against the transactions committed at the server in the last broadcast cycle. Transactions that survive local pre-validation must be submitted to the server for final validation. The new protocol eliminates conflicts between mobile read-only and mobile update transactions, and resolves data conflicts flexibly by using multiversion dynamic adjustment of the serialization order to avoid unnecessary restarts of transactions. Mobile read-only transactions can be committed without blocking, and the response time of mobile read-only transactions is greatly shortened. The tolerance of mobile transactions to disconnections from the broadcast channel is increased. In global validation, mobile distributed transactions have to perform checks to ensure distributed serializability among all participants. The simulation results show that the new concurrency control protocol offers better performance than other protocols in terms of miss rate, restart rate, and commit rate. Under high workload (think time of 1 s), the miss rate of DMVOCC-MVDA is only 14.6%, significantly lower than that of other protocols. The restart rate of DMVOCC-MVDA is only 32.3%, showing that it can effectively reduce the restart rate of mobile transactions. The commit rate of DMVOCC-MVDA is up to 61.2%, obviously higher than that of other protocols.
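The local pre-validation step can be sketched as a backward-validation check. This is a generic optimistic-concurrency-control illustration under assumed read/write-set semantics, not the DMVOCC-MVDA protocol itself (which additionally adjusts serialization order with multiple versions):

```python
def pre_validate(read_set, write_set, committed_write_sets):
    """Backward-validation sketch: a mobile transaction fails local
    pre-validation if any item it read was overwritten by a transaction
    committed at the server in the last broadcast cycle."""
    for ws in committed_write_sets:
        if read_set & ws:
            return False  # conflict detected: restart locally, skip the uplink
    return True

# T read {x, y}; a committed transaction wrote y, so T restarts locally
# instead of wasting an uplink request to the server.
print(pre_validate({"x", "y"}, {"x"}, [{"y"}, {"z"}]))  # False
```

The point of pre-validating at the mobile host is that doomed transactions never consume scarce uplink bandwidth; only survivors are submitted for final validation.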
We developed a parallel object-relational DBMS named PORLES. It uses the BSP model as its parallel computing model and monoid calculus as the basis of its data model. In this paper, we introduce its data model, parallel query optimization, transaction processing system, and parallel access method in detail.
In this paper, the entity-relation data model for integrating spatio-temporal data is designed. With this design, spatio-temporal data can be effectively stored and spatio-temporal analysis can be easily realized.
Fingerprint-based Bluetooth positioning is a popular indoor positioning technology. However, changes in the indoor environment and Bluetooth anchor locations have a significant impact on signal distribution, which results in a decline in positioning accuracy. The widespread adoption of Bluetooth positioning is limited by the manual effort needed to collect fingerprints with position labels for fingerprint database construction and updating. To address this problem, this paper presents an adaptive fingerprint database updating approach. First, crowdsourced data including Bluetooth Received Signal Strength (RSS) sequences and the speed and heading of the pedestrian are recorded. Second, the recorded crowdsourced data are fused by Kalman Filtering (KF) and then fed into a trajectory validity analysis model, with the purpose of assigning the unlabeled RSS data position labels to generate candidate fingerprints. Third, after enough candidate fingerprints have been obtained at each Reference Point (RP), Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is conducted on both the original and the candidate fingerprints to filter out the fingerprints identified as noise, and the mean of the fingerprints in the cluster with the largest data volume is selected as the updated fingerprint of the corresponding RP. Finally, extensive experimental results show that as the number of candidate fingerprints and update iterations increases, fingerprint-based Bluetooth positioning accuracy can be effectively improved.
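The third step (DBSCAN filtering, then taking the mean of the largest cluster) can be sketched on scalar RSS values. The minimal DBSCAN below and the toy parameters (`eps`, `min_pts`) are illustrative assumptions, not the paper's configuration:

```python
def dbscan_1d(values, eps=2.0, min_pts=3):
    """Minimal DBSCAN on scalar RSS values; returns cluster labels
    (-1 means noise). Toy stand-in for the paper's filtering step."""
    labels = [None] * len(values)
    cluster = -1
    for i, v in enumerate(values):
        if labels[i] is not None:
            continue
        neighbors = [j for j, w in enumerate(values) if abs(w - v) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1                    # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster           # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = [k for k, w in enumerate(values) if abs(w - values[j]) <= eps]
            if len(nbrs) >= min_pts:          # core point: keep expanding
                queue.extend(nbrs)
    return labels

def updated_fingerprint(rss_samples, eps=2.0, min_pts=3):
    """Mean of the largest DBSCAN cluster becomes the RP's new fingerprint."""
    labels = dbscan_1d(rss_samples, eps, min_pts)
    clusters = {}
    for lbl, v in zip(labels, rss_samples):
        if lbl != -1:
            clusters.setdefault(lbl, []).append(v)
    largest = max(clusters.values(), key=len)
    return sum(largest) / len(largest)

samples = [-60.0, -61.0, -59.5, -60.5, -85.0]  # -85 dBm is an outlier
print(updated_fingerprint(samples))            # -60.25
```

Real fingerprints are multi-anchor RSS vectors, so the distance would be Euclidean over vectors rather than a scalar difference; the clustering-then-averaging logic is the same.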
Schema incompatibility is a major challenge to a federated database system for data sharing among heterogeneous, multiple, and autonomous databases. This paper presents a mapping approach based on import schemas, export schemas, and domain conversion functions, through which schema incompatibility problems such as naming conflicts, domain incompatibility, and entity definition incompatibility can be resolved effectively. The implementation techniques are also discussed.
An approach to implementing the multimedia database system NHMDB based on the NF2 (Non-First-Normal-Form) data model is presented. This approach is easily implemented because the NF2 structure can efficiently store various media data such as formatted data, text, graphics, images, and voice. The main idea is to expand the conceptual schema to maintain the consistency of the three-level schema in NHMDB. We developed, implemented, and experimented with the storage structure and the representation of multimedia data by object identifiers. Implementation techniques are also discussed.
This paper aims to establish a comparative study between a relational Microsoft SQL Server database and a non-relational MongoDB database for the unstructured representation of data in JSON format. A great amount of work has been done comparing multiple database management applications on the basis of their performance, security, etc., but limited information is available on assessing these databases on the basis of the provided data. This study mainly focuses on the possibilities that both of these database types offer when handling data in JSON. We accomplish this by implementing a series of experiments, taking into consideration that the subject data does not require normalization, and then evaluate the outcomes to reach a conclusion.
A Database Management System (DBMS) is the current standard for storing information. A DBMS organizes and maintains a structure of storage of data. Databases make it possible to store vast amounts of randomly created information and then retrieve items using associative reasoning in search routines. However, the design of databases is cumbersome. If one is to use a database primarily to directly input information, each field must be predefined manually, and the fields must be organized to permit coherent data input. This static requirement is problematic and requires that database tables be predefined and customized at the outset, a difficult proposition since current DBMSs lack a user-friendly front end that would allow flexible design of the input model. Furthermore, databases are primarily text based, making it difficult to process graphical data. We have developed a general and nonproprietary approach to the problem of input modeling, designed to use the known informational architecture to map data to a database and then retrieve the original document in freely editable form. We create form templates using ordinary word processing software: Microsoft InfoPath 2007. Each field in the form is given a unique name identifier so that it can be distinguished in the database. It is possible to export text-based documents created initially in Microsoft Word by placing a colon at the beginning of any desired field location; InfoPath then captures the preceding string and uses it as the label for the field. Each form can be structured to include any combination of textual and graphical fields. We input data into InfoPath templates and then submit the data through a web service to populate fields in an SQL database. By appropriate indexing, we can then recall the entire document from the SQL database for editing, with a corresponding audit trail. Graphical data is handled no differently than textual data and is embedded in the database itself, permitting direct query approaches. This technique makes it possible for general users to benefit from a combined text-graphical database environment with a flexible, non-proprietary interface. Consequently, any template can be effortlessly transformed into a database system and easily recovered in narrative form.
A distributed processing system (DPS) contains many autonomous nodes, which contribute their own computing power. A DPS is considered a unified logical structure operating in a distributed manner; the processing tasks are divided into fragments and assigned to various nodes for processing. That type of operation requires and involves a great deal of communication. We propose a decentralized approach, based on a distributed hash table, to reduce the communication overhead and remove the server unit, thus avoiding a single point of failure in the system. This paper proposes a mathematical model and algorithms that are implemented in a dedicated experimental system. Using the decentralized approach, this study demonstrates the efficient operation of a decentralized system, which results in reduced energy emission.
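The serverless assignment of task fragments to nodes can be sketched with a consistent-hash ring, one common way a distributed hash table maps keys to nodes; the node and fragment names are made up for illustration, and the paper's own model may differ:

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Toy consistent-hash ring: every node can compute which node owns
    a task fragment from the key alone, with no central server."""
    def __init__(self, nodes):
        self.ring = sorted((self._h(n), n) for n in nodes)
    @staticmethod
    def _h(key):
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)
    def node_for(self, key):
        hashes = [h for h, _ in self.ring]
        # First node clockwise from the key's hash owns the key.
        idx = bisect_right(hashes, self._h(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
assignment = {f"fragment-{i}": ring.node_for(f"fragment-{i}") for i in range(4)}
print(assignment)
```

Because every participant evaluates the same hash function, fragment placement needs no coordination messages, which is exactly where the communication savings come from.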
Data transformation is the core process in migrating a database from a relational database to a NoSQL database such as a column-oriented database. However, there is no standard guideline for data transformation from a relational database to a NoSQL database. A number of schema transformation techniques have been proposed to improve the data transformation process, and they have resulted in better query processing time when compared to the relational database. However, these approaches produce redundant tables in the resulting schema, which in turn consume large amounts of unnecessary storage and yield high query processing time due to redundant column families in the transformed column-oriented database. In this paper, an efficient data transformation technique from a relational database to a column-oriented database is proposed. The proposed schema transformation technique is based on the combination of a denormalization approach, data access patterns, and multiple-nested schemas. In order to validate the proposed work, the technique is implemented by transforming data from a MySQL database to a MongoDB database. A benchmark transformation technique is also performed, against which the query processing time and the storage size are compared. Based on the experimental results, the proposed transformation technique shows significant improvement in terms of query processing time and storage space usage due to the reduced number of column families in the column-oriented database.
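The denormalization idea can be illustrated with a minimal sketch: embedding child rows inside their parent record so the target database answers the query without a join. The table and field names are invented for illustration, and this is only the embedding step, not the paper's full multiple-nested-schema technique:

```python
def denormalize(orders, order_items):
    """Embed each order's item rows inside the order document, trading
    normalized tables for join-free reads (denormalization sketch)."""
    docs = []
    for order in orders:
        items = [i for i in order_items if i["order_id"] == order["id"]]
        doc = dict(order)
        # Drop the now-redundant foreign key from each embedded item.
        doc["items"] = [{k: v for k, v in i.items() if k != "order_id"}
                        for i in items]
        docs.append(doc)
    return docs

orders = [{"id": 1, "customer": "Ann"}]
order_items = [{"order_id": 1, "sku": "X", "qty": 2},
               {"order_id": 1, "sku": "Y", "qty": 1}]
print(denormalize(orders, order_items))
```

The storage/redundancy trade-off the abstract describes shows up directly here: any order field repeated across embedded copies is the redundancy a careful transformation technique must keep in check.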
Computer simulation for materials processing needs a huge database containing a great deal of various physical properties of materials. In order to employ the large body of data on materials heat treatment accumulated over past years, it is significant to develop an intelligent database system. Based on data mining technology for data analysis, an intelligent database web tool system of computer simulation for the heat treatment process, named IndBASEweb-HT, was built. The architecture and algorithms of this system as well as its application are introduced.
Since the early 1990s, significant progress in database technology has provided a new platform for emerging new dimensions of data engineering. New models were introduced to utilize the data sets stored in the new generations of databases. These models have a deep impact on evolving decision-support systems, but they suffer from a variety of practical problems while accessing real-world data sources. Specifically, a type of data storage model based on data distribution theory has been increasingly used in recent years by large-scale enterprises, yet it is not compatible with existing decision-support models. This data storage model stores the data in the different geographical sites where they are more regularly accessed. This leads to considerably less inter-site data transfer, which can reduce data security issues in some circumstances and also significantly improve the speed of data manipulation transactions. The aim of this paper is to propose a new approach for supporting proactive decision-making that utilizes a workable data source management methodology. The new model can effectively organize and use complex data sources, even when they are distributed across different sites in fragmented form. At the same time, the new model provides a very high level of intellectual management decision support through intelligent use of the data collections, utilizing new smart methods for synthesizing useful knowledge. The results of an empirical study to evaluate the model are provided.
A data identifier (DID) is an essential tag or label in all kinds of databases, particularly those related to integrated computational materials engineering (ICME), inheritable integrated intelligent manufacturing (I3M), and the Industrial Internet of Things. With the guidance and quick acceleration of the development of advanced materials, as envisioned by official documents worldwide, more investigations are required to construct relative numerical standards for materials informatics. This work proposes a universal DID format consisting of a set of build chains, which aligns with the classical form of identifier in both international and national standards, such as ISO/IEC 29168-1:2000, GB/T 27766-2011, GA/T 543.2-2011, GM/T 0006-2012, GJB 7365-2011, SL 325-2014, SL 607-2018, WS 363.2-2011, and QX/T 39-2005. Each build chain is made up of capital letters and numbers, with no symbols. Moreover, the total length of each build chain is not restricted, which follows the formation of the Universal Coded Character Set in the international standard ISO/IEC 10646. Based on these rules, the proposed DID is flexible and convenient for extending and sharing in and between various cloud-based platforms. Accordingly, classical two-dimensional (2D) codes, including the Hanxin Code, Lots Perception Matrix (LP) Code, Quick Response (QR) Code, Grid Matrix (GM) Code, and Data Matrix (DM) Code, can be constructed and precisely recognized and/or decoded by either smartphones or specific machines. By utilizing these 2D codes as the fingerprints of a set of data linked with cloud-based platforms, progress and updates in the composition-processing-structure-property-performance workflow can be tracked spontaneously, paving a path to accelerate the discovery and manufacture of advanced materials and enhance research productivity, performance, and collaboration.
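The stated build-chain rules (capital letters and digits only, no symbols, unrestricted length) are easy to enforce mechanically. The sketch below is only a validator for those rules; the actual build-chain vocabulary, field meanings, and separator convention are defined by the paper, so the chain values and the hyphen join here are hypothetical:

```python
import re

# Enforces only the stated character rules: capital letters and digits,
# no symbols, unrestricted length.
CHAIN = re.compile(r"^[A-Z0-9]+$")

def make_did(build_chains, sep="-"):
    """Validate each build chain and join them into one identifier.
    The separator is an assumption for display; chains themselves
    contain no symbols."""
    for chain in build_chains:
        if not CHAIN.match(chain):
            raise ValueError(f"invalid build chain: {chain!r}")
    return sep.join(build_chains)

# Hypothetical chains for a material, alloy grade, and heat-treatment step.
print(make_did(["MAT2024", "AL7075", "HT01"]))  # MAT2024-AL7075-HT01
```

An identifier built this way can then be rendered as any of the listed 2D codes (QR, DM, etc.) by standard encoders, since it is plain ASCII text.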
A data lake (DL) denotes a vast reservoir or repository of data. It accumulates substantial volumes of data and employs advanced analytics to correlate data from diverse origins containing various forms of semi-structured, structured, and unstructured information. These systems use a flat architecture and run different types of data analytics. NoSQL databases are non-tabular and store data in a different manner than relational tables. NoSQL databases come in various forms, including key-value pairs, documents, wide columns, and graphs, each based on its own data model. They offer simpler scalability and generally outperform traditional relational databases. While NoSQL databases can store diverse data types, they lack full support for the atomicity, consistency, isolation, and durability features found in relational databases. Consequently, employing machine learning approaches becomes necessary to categorize complex structured query language (SQL) queries. Results indicate that the most frequently used automatic classification technique in processing SQL queries on NoSQL databases is machine learning-based classification. Overall, this study provides an overview of the automatic classification techniques used in processing SQL queries on NoSQL databases. Understanding these techniques can aid in the development of effective and efficient NoSQL database applications.
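To make the classification task concrete, here is a trivial rule-based baseline that buckets SQL queries by type before routing them to a NoSQL back end. The categories are invented for illustration; the surveyed systems use trained machine-learning classifiers rather than rules like these:

```python
def classify_sql(query):
    """Rule-based baseline for categorizing SQL queries (illustrative;
    the surveyed approaches use ML classifiers instead of rules)."""
    q = query.strip().upper()
    if q.startswith("SELECT"):
        if any(f in q for f in ("COUNT(", "SUM(", "AVG(", "GROUP BY")):
            return "aggregate"   # may need scatter-gather over shards
        return "read"            # point/range lookup
    if q.startswith(("INSERT", "UPDATE", "DELETE")):
        return "write"
    return "other"

print(classify_sql("SELECT name FROM users WHERE id = 3"))       # read
print(classify_sql("SELECT COUNT(*) FROM users GROUP BY city"))  # aggregate
```

A learned classifier replaces the hand-written rules with features extracted from the query text, which is what makes it practical for the "complex" queries the abstract mentions.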
An outsourced database is a database service provided by cloud computing companies. Using an outsourced database can reduce hardware and software costs and provide more efficient and reliable data processing capacity. However, the outsourced database still faces some challenges. If the service provider is not sufficiently trustworthy, there is the possibility of data leakage, and since the data may contain users' private information, data leakage may cause a privacy breach. For this reason, protecting the privacy of data in the outsourced database becomes very important. In the past, scholars have proposed k-anonymity to protect data privacy in databases; it makes data anonymous to avoid privacy leaks. But k-anonymity has some problems: it is irreversible, and it is vulnerable to homogeneity attacks and background knowledge attacks. Later, scholars proposed approaches that address homogeneity attacks and background knowledge attacks, but these still cannot recover the original data. In this paper, we propose a data anonymization method that is reversible and also prevents those two attacks. Our study is based on the proposed r-transform, which can be applied to numeric attributes in the outsourced database. In the experiments, we discuss the time required to anonymize and recover data. Furthermore, we investigate the defense against homogeneity attacks and background knowledge attacks. Finally, we summarize the proposed method and future research.
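The notion of a reversible transform on numeric attributes can be illustrated with a keyed affine map. This is a generic stand-in, not the paper's r-transform, and an affine map alone is not a secure masking scheme; it only shows the reversibility property that distinguishes this line of work from k-anonymity:

```python
def anonymize(values, key_a, key_b):
    """Mask numeric attribute values with a keyed affine map.
    Illustration of reversibility only -- not the paper's r-transform,
    and not cryptographically secure on its own."""
    return [key_a * v + key_b for v in values]

def recover(masked, key_a, key_b):
    """Invert the affine map with the same secret keys."""
    return [(m - key_b) / key_a for m in masked]

ages = [34, 51, 27]
masked = anonymize(ages, key_a=7.0, key_b=13.0)   # keys stay with the owner
print(masked)                                     # stored at the provider
print(recover(masked, 7.0, 13.0))                 # [34.0, 51.0, 27.0]
```

The key point is the asymmetry with k-anonymity: generalization destroys the original values, whereas a keyed transform lets the data owner, and only the owner, restore them.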
The data tree table is a type of data structure consisting of a data tree and a table, which has a wide field of applications. The visual and dynamic growing algorithm of the data tree table and its software method are presented based on the theory of data structures and visual software technology. The method of expressing and managing the data tree table in relational mode is also explored.
Funding for the database distribution management (DBDM) paper: supported by the Taiwan Ministry of Economic Affairs and the Institute for Information Industry under the project titled "Fundamental Industrial Technology Development Program (1/4)".
Funding for the dam safety monitoring paper: supported by the National Natural Science Foundation of China (Grant Nos. 50539010, 50539110, 50579010, 50539030, and 50809025).
Funding for the DMVOCC-MVDA paper: Project (20030533011) supported by the National Research Foundation for the Doctoral Program of Higher Education of China.
基金Sponsored by the National Natural Science Foundation of China(Grant Nos.61771083,61704015)the Program for Changjiang Scholars and Innovative Research Team in University(Grant No.IRT1299)+3 种基金the Special Fund of Chongqing Key Laboratory(CSTC)Fundamental Science and Frontier Technology Research Project of Chongqing(Grant Nos.cstc2017jcyjAX0380,cstc2015jcyjBX0065)the Scientific and Technological Research Foundation of Chongqing Municipal Education Commission(Grant No.KJ1704083)the University Outstanding Achievement Transformation Project of Chongqing(Grant No.KJZH17117).
文摘Fingerprint⁃based Bluetooth positioning is a popular indoor positioning technology.However,the change of indoor environment and Bluetooth anchor locations has significant impact on signal distribution,which will result in the decline of positioning accuracy.The widespread extension of Bluetooth positioning is limited by the need of manual effort to collect the fingerprints with position labels for fingerprint database construction and updating.To address this problem,this paper presents an adaptive fingerprint database updating approach.First,the crowdsourced data including the Bluetooth Received Signal Strength(RSS)sequences and the speed and heading of the pedestrian were recorded.Second,the recorded crowdsourced data were fused by the Kalman Filtering(KF),and then fed into the trajectory validity analysis model with the purpose of assigning the unlabeled RSS data with position labels to generate candidate fingerprints.Third,after enough candidate fingerprints were obtained at each Reference Point(RP),the Density⁃based Spatial Clustering of Applications with Noise(DBSCAN)approach was conducted on both the original and the candidate fingerprints to filter out the fingerprints which had been identified as the noise,and then the mean of fingerprints in the cluster with the largest data volume was selected as the updated fingerprint of the corresponding RP.Finally,the extensive experimental results show that with the increase of the number of candidate fingerprints and update iterations,the fingerprint⁃based Bluetooth positioning accuracy can be effectively improved.
Abstract: Schema incompatibility is a major challenge for a federated database system that shares data among heterogeneous, multiple, and autonomous databases. This paper presents a mapping approach based on import schemas, export schemas, and domain conversion functions, through which schema incompatibility problems such as naming conflicts, domain incompatibility, and entity definition incompatibility can be resolved effectively. The implementation techniques are also discussed.
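The mapping idea can be illustrated with a small sketch. The field names and the currency conversion below are invented for illustration, not taken from the paper: a rename table resolves naming conflicts, and a domain conversion function resolves domain incompatibility when importing an exported record.

```python
# Hypothetical export-schema -> import-schema field renames (naming conflicts).
FIELD_MAP = {"emp_name": "name", "salary_usd": "salary"}

def usd_to_cents(v):
    """Assumed domain conversion function: dollars -> integer cents."""
    return round(v * 100)

# Hypothetical per-field domain conversions (domain incompatibility).
CONVERT = {"salary_usd": usd_to_cents}

def import_record(exported):
    """Resolve naming and domain incompatibility via the mapping tables."""
    return {FIELD_MAP.get(k, k): CONVERT.get(k, lambda x: x)(v)
            for k, v in exported.items()}
```

Fields without an entry in either table pass through unchanged, so the same import routine works across sources whose export schemas only partially conflict.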
Abstract: An approach to implementing the multimedia database system NHMDB based on the NF2 (Non-First-Normal-Form) data model is presented. This approach is easy to implement because the NF2 structure can efficiently store various media data such as formatted data, text, graphics, images, and voice. The main idea is to expand the conceptual schema to maintain the consistency of the three-level schema in NHMDB. We developed, implemented, and experimented with the storage structure and the representation of multimedia data by object identifiers. The implementation techniques are also discussed.
Abstract: This paper establishes a comparative study between a relational Microsoft SQL Server database and a non-relational MongoDB database for the unstructured representation of data in JSON format. A great deal of work has been done comparing database management applications on the basis of performance, security, and so on, but limited information is available on assessing these databases on the basis of the data they are given. This study focuses on the possibilities that both database types offer when handling data in JSON. We accomplish this by running a series of experiments, taking into consideration that the subject data does not need to be normalized, and evaluating the outcomes to draw a conclusion.
Abstract: The Database Management System (DBMS) is the current standard for storing information. A DBMS organizes and maintains a structured store of data. Databases make it possible to store vast amounts of arbitrarily created information and then retrieve items using associative reasoning in search routines. However, database design is cumbersome. If a database is to be used primarily for direct data entry, each field must be predefined manually, and the fields must be organized to permit coherent data input. This static requirement is problematic: the database tables must be predefined and customized at the outset, a difficult proposition since current DBMSs lack a user-friendly front end that allows flexible design of the input model. Furthermore, databases are primarily text-based, making it difficult to process graphical data. We have developed a general, non-proprietary approach to input modeling that uses the known informational architecture to map data to a database and then retrieve the original document in freely editable form. We create form templates using ordinary word-processing software, Microsoft InfoPath 2007. Each field in a form is given a unique name identifier so it can be distinguished in the database. Text-based documents created initially in Microsoft Word can be exported by placing a colon at the beginning of any desired field location; InfoPath then captures the preceding string and uses it as the label for the field. Each form can be structured to include any combination of textual and graphical fields. We enter data into InfoPath templates and then submit it through a web service to populate fields in an SQL database. With appropriate indexing, the entire document can then be recalled from the SQL database for editing, with a corresponding audit trail. Graphical data is handled no differently from textual data and is embedded in the database itself, permitting direct query approaches. This technique allows general users to benefit from a combined text-graphical database environment with a flexible, non-proprietary interface. Consequently, any template can be effortlessly transformed into a database system and easily recovered in narrative form.
Abstract: A distributed processing system (DPS) contains many autonomous nodes, each contributing its own computing power. A DPS is considered a unified logical structure operating in a distributed manner; the processing tasks are divided into fragments and assigned to various nodes for processing. That type of operation requires and involves a great deal of communication. We propose a decentralized approach, based on a distributed hash table, to reduce the communication overhead and remove the server unit, thus avoiding a single point of failure in the system. This paper proposes a mathematical model and algorithms that are implemented in a dedicated experimental system. Using the decentralized approach, this study demonstrates the efficient operation of a decentralized system, which results in reduced energy emission.
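A distributed hash table lets every node compute a task fragment's owner locally, with no central server to fail. One common realization is a consistent-hash ring, sketched below; the node names, virtual-node count, and hash choice are illustrative assumptions, not the paper's exact model.

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Toy consistent-hash ring: maps task keys to nodes without a server unit."""

    def __init__(self, nodes, vnodes=50):
        # Each node is placed at many pseudo-random points ("virtual nodes")
        # on the ring so load spreads evenly.
        self.ring = sorted(
            (self._h(f"{node}#{v}"), node) for node in nodes for v in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _h(s):
        return int(hashlib.sha256(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        """Owner of a key: the first ring position clockwise from its hash."""
        i = bisect_right(self.keys, self._h(key)) % len(self.keys)
        return self.ring[i][1]
```

Because every node runs the same hash function over the same membership list, assignment requires no coordination messages, which is the communication saving the decentralized approach relies on.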
Funding: Supported by the Universiti Putra Malaysia Grant Scheme (Putra Grant) (GP/2020/9692500).
Abstract: Data transformation is the core process in migrating a database from a relational database to a NoSQL database such as a column-oriented database. However, there is no standard guideline for data transformation from a relational database to a NoSQL database. A number of schema transformation techniques have been proposed to improve the data transformation process, achieving better query processing time than the relational database. However, these approaches produce redundant tables in the resulting schema, which in turn consume large amounts of unnecessary storage and yield high query processing time due to the redundant column families in the transformed column-oriented database. In this paper, an efficient data transformation technique from a relational database to a column-oriented database is proposed. The proposed schema transformation technique is based on a combination of a denormalization approach, data access patterns, and a multiple-nested schema. To validate the proposed work, the technique is implemented by transforming data from a MySQL database to a MongoDB database. A benchmark transformation technique is also performed, against which the query processing time and the storage size are compared. Based on the experimental results, the proposed transformation technique shows significant improvement in query processing time and storage space usage due to the reduced number of column families in the transformed database.
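The denormalization idea, folding child rows into their parent as a nested structure so that join-time transfers are avoided, can be sketched with plain dictionaries standing in for MySQL rows and MongoDB documents. The table and field names here are invented for illustration and are not the paper's schema.

```python
def to_nested_documents(customers, orders):
    """Denormalize parent/child rows into one nested document per customer,
    so a single document read replaces a relational join."""
    by_customer = {}
    for o in orders:
        # Drop the foreign key: nesting under the parent makes it redundant.
        by_customer.setdefault(o["customer_id"], []).append(
            {k: v for k, v in o.items() if k != "customer_id"})
    return [dict(c, orders=by_customer.get(c["id"], [])) for c in customers]
```

Producing exactly one document per parent row is what keeps the resulting schema free of the redundant copies that inflate storage size in naive transformations.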
Abstract: Computer simulation for materials processing needs a huge database containing a great deal of various physical properties of materials. In order to exploit the large body of data on materials heat treatment accumulated over past years, it is important to develop an intelligent database system. Based on data mining technology for data analysis, an intelligent database web tool system for computer simulation of the heat treatment process, named IndBASEweb-HT, was built. The architecture and algorithms of this system, as well as its applications, are introduced.
Abstract: Since the early 1990s, significant progress in database technology has provided a new platform for emerging dimensions of data engineering. New models were introduced to utilize the data sets stored in the new generations of databases. These models have a deep impact on evolving decision-support systems, but they suffer from a variety of practical problems when accessing real-world data sources. Specifically, a type of data storage model based on data distribution theory has been increasingly used in recent years by large-scale enterprises, yet it is not compatible with existing decision-support models. This storage model keeps data in the geographical sites where it is most regularly accessed. This leads to considerably less inter-site data transfer, which can reduce data security issues in some circumstances and also significantly improve the speed of data manipulation transactions. The aim of this paper is to propose a new approach for supporting proactive decision-making that utilizes a workable data source management methodology. The new model can effectively organize and use complex data sources, even when they are distributed across different sites in fragmented form. At the same time, the new model provides a very high level of intellectual management decision support by making intelligent use of the data collections through new smart methods for synthesizing useful knowledge. The results of an empirical study to evaluate the model are provided.
Funding: This work was financially supported by the National Key Research and Development Program of China (2018YFB0703801, 2018YFB0703802, 2016YFB0701303, and 2016YFB0701304) and CRRC Tangshan Co., Ltd. (201750463031). Special thanks to Professor Hong Wang at Shanghai Jiao Tong University for the fruitful discussions and the constructive suggestions and comments.
Abstract: A data identifier (DID) is an essential tag or label in all kinds of databases, particularly those related to integrated computational materials engineering (ICME), inheritable integrated intelligent manufacturing (I3M), and the Industrial Internet of Things. With the guidance and rapid acceleration of the development of advanced materials, as envisioned by official documents worldwide, more investigations are required to construct relevant numerical standards for materials informatics. This work proposes a universal DID format consisting of a set of build chains, which aligns with the classical form of identifier in both international and national standards, such as ISO/IEC 29168-1:2000, GB/T 27766-2011, GA/T 543.2-2011, GM/T 0006-2012, GJB 7365-2011, SL 325-2014, SL 607-201&, WS 363.2-2011, and QX/T 39-2005. Each build chain is made up of capital letters and numbers, with no symbols. Moreover, the total length of each build chain is not restricted, following the formation of the Universal Coded Character Set in the international standard ISO/IEC 10646. Based on these rules, the proposed DID is flexible and convenient for extending and sharing in and between various cloud-based platforms. Accordingly, classical two-dimensional (2D) codes, including the Hanxin Code, Lots Perception Matrix (LP) Code, Quick Response (QR) Code, Grid Matrix (GM) Code, and Data Matrix (DM) Code, can be constructed and precisely recognized and/or decoded by either smartphones or specific machines. By utilizing these 2D codes as the fingerprints of a set of data linked with cloud-based platforms, progress and updates in the composition-processing-structure-property-performance workflow can be tracked spontaneously, paving a path to accelerate the discovery and manufacture of advanced materials and enhance research productivity, performance, and collaboration.
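The stated build-chain rule (capital letters and digits only, no symbols, unrestricted length) can be checked mechanically. The sketch below assumes a hyphen joins chains into a full DID; that separator and the sample identifiers are illustrative assumptions, since the abstract does not specify how chains are concatenated.

```python
import re

# One build chain: one or more capital letters or digits, no symbols.
CHAIN = re.compile(r"^[A-Z0-9]+$")

def valid_did(did, sep="-"):
    """Check that a DID is a non-empty sequence of valid build chains.

    `sep` is a hypothetical chain separator, not part of the abstract's rules.
    """
    chains = did.split(sep)
    return len(chains) > 0 and all(CHAIN.fullmatch(c) for c in chains)
```

Because each chain is plain uppercase alphanumerics of unrestricted length, a valid DID can be embedded directly in QR, DM, or other 2D code payloads without escaping.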
Funding: Supported by the Student Scheme provided by Universiti Kebangsaan Malaysia under Code TAP-20558.
Abstract: A data lake (DL) denotes a vast reservoir or repository of data. It accumulates substantial volumes of data and employs advanced analytics to correlate data from diverse origins containing various forms of semi-structured, structured, and unstructured information. These systems use a flat architecture and run different types of data analytics. NoSQL databases are non-tabular and store data differently from relational tables. NoSQL databases come in various forms, including key-value pairs, documents, wide columns, and graphs, each based on its data model. They offer simpler scalability and generally outperform traditional relational databases. While NoSQL databases can store diverse data types, they lack full support for the atomicity, consistency, isolation, and durability features found in relational databases. Consequently, machine learning approaches become necessary to categorize complex structured query language (SQL) queries. Results indicate that the most frequently used automatic classification technique in processing SQL queries on NoSQL databases is machine learning-based classification. Overall, this study provides an overview of the automatic classification techniques used in processing SQL queries on NoSQL databases. Understanding these techniques can aid the development of effective and efficient NoSQL database applications.
Abstract: An outsourced database is a database service provided by cloud computing companies. Using an outsourced database can reduce hardware and software costs and provide more efficient and reliable data processing capacity. However, the outsourced database still faces some challenges. If the service provider is not sufficiently trustworthy, there is the possibility of data leakage, and since the data may contain users' private information, leakage can compromise their privacy. For this reason, protecting the privacy of data in the outsourced database becomes very important. In the past, scholars have proposed k-anonymity to protect data privacy in databases by making data anonymous to avoid privacy leaks. But k-anonymity has some problems: it is irreversible, and it is vulnerable to homogeneity attacks and background knowledge attacks. Later studies have addressed these two attacks, but they still cannot recover the original data. In this paper, we propose a data anonymization method that is reversible and also prevents both attacks. Our study is based on the proposed r-transform, which can be applied to numeric attributes in the outsourced database. In the experiments, we discuss the time required to anonymize and recover data, and we investigate the defense against homogeneity attacks and background knowledge attacks. Finally, we summarize the proposed method and future research directions.
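The r-transform itself is not specified in this abstract. As a stand-in, the sketch below illustrates only the reversibility property the paper emphasizes, using a simple keyed affine masking of numeric attributes; this is a generic illustration, not the paper's r-transform, and it makes no claim about resisting the attacks discussed above.

```python
def mask(values, key):
    """Keyed affine masking of numeric attributes: v -> a*v + b.

    Unlike k-anonymity generalization, the original values are recoverable
    by anyone holding the key (a, b), with a != 0.
    """
    a, b = key
    return [a * v + b for v in values]

def unmask(masked, key):
    """Invert the masking: m -> (m - b) / a."""
    a, b = key
    return [(m - b) / a for m in masked]
```

The key stays with the data owner, so the outsourced service only ever sees masked values, yet the owner can always recover the exact originals.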
Abstract: The data tree table is a data structure consisting of a data tree and a table, and it has a wide field of applications. A visual, dynamic growth algorithm for the data tree table and its software implementation are presented, based on the theory of data structures and visual software technology. The expression and management of the data tree table in the relational model are also explored.