Spatial objects have two types of attributes: geometrical attributes and non-geometrical attributes, which belong to two different attribute domains (the geometrical and non-geometrical domains). Although geometrically scattered in the geometrical domain, spatial objects may be similar to each other in the non-geometrical domain. Most existing clustering algorithms group spatial datasets into compact regions in the geometrical domain without considering the non-geometrical domain. However, many application scenarios require clustering results in which a cluster has not only high proximity in the geometrical domain but also high similarity in the non-geometrical domain. This means constraints are imposed on the clustering goal from both the geometrical and non-geometrical domains simultaneously. Such a clustering problem is called dual clustering. As distributed clustering applications become more and more popular, it is necessary to tackle the dual clustering problem in distributed databases. The DCAD algorithm is proposed to solve this problem. DCAD consists of two levels of clustering: local clustering and global clustering. First, clustering is conducted at each local site with a local clustering algorithm, and the features of the local clusters are extracted. Second, the local features from each site are sent to a central site, where global clustering is obtained based on those features. Experiments on both artificial and real spatial datasets show that DCAD is effective and efficient.
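A minimal sketch of the two-level flow described above, assuming hypothetical function names and using plain k-means for both the local and the global step; it illustrates only the distributed local-then-global structure, not DCAD's dual-domain constraint:

```python
# Sketch of a DCAD-style two-level clustering flow (hypothetical names).
# Each local site clusters its own points, exports compact cluster features
# (centroid, size), and a central site clusters those features globally.
import random
import math

def kmeans(points, k, iters=20):
    """Plain k-means used here for both local and global clustering."""
    centroids = random.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[j].append(p)
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups

def local_clustering(site_points, k):
    """Cluster one site's data and extract per-cluster features (centroid, size)."""
    centroids, groups = kmeans(site_points, k)
    return [(c, len(g)) for c, g in zip(centroids, groups) if g]

def global_clustering(all_features, k):
    """Cluster the shipped local-cluster centroids at the central site."""
    centroids = [f[0] for f in all_features]
    return kmeans(centroids, k)

if __name__ == "__main__":
    random.seed(0)
    sites = [[(random.gauss(sx, 1.0), random.gauss(sy, 1.0)) for _ in range(200)]
             for sx, sy in [(0, 0), (10, 0), (5, 8)]]
    features = [feat for pts in sites for feat in local_clustering(pts, k=3)]
    global_centroids, _ = global_clustering(features, k=3)
    print(global_centroids)
```

Only the per-cluster summaries cross the network in this scheme, which is what makes the two-level design attractive for distributed databases.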
A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transactions in mobile broadcast environments. At the mobile hosts, all transactions perform local pre-validation. The local pre-validation process is carried out against the transactions committed at the server in the last broadcast cycle. Transactions that survive local pre-validation must be submitted to the server for final validation. The new protocol eliminates conflicts between mobile read-only and mobile update transactions, and resolves data conflicts flexibly by using multiversion dynamic adjustment of the serialization order to avoid unnecessary transaction restarts. Mobile read-only transactions can be committed without blocking, and the response time of mobile read-only transactions is greatly shortened. The tolerance of mobile transactions to disconnections from the broadcast channel is increased. In global validation, mobile distributed transactions must perform checks to ensure distributed serializability at all participants. The simulation results show that the new concurrency control protocol offers better performance than other protocols in terms of miss rate, restart rate, and commit rate. Under a high workload (think time of 1 s), the miss rate of DMVOCC-MVDA is only 14.6%, significantly lower than that of other protocols. The restart rate of DMVOCC-MVDA is only 32.3%, showing that DMVOCC-MVDA can effectively reduce the restart rate of mobile transactions. The commit rate of DMVOCC-MVDA reaches 61.2%, which is clearly higher than that of other protocols.
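The local pre-validation step can be illustrated with a small sketch (illustrative names and structures only; the real protocol additionally uses multiversion dynamic adjustment of the serialization order rather than always restarting on conflict): a mobile transaction's read set is checked against the write sets of transactions committed at the server during the last broadcast cycle.

```python
# Hedged sketch of local pre-validation at a mobile host: a conflict with any
# write set committed during the last broadcast cycle means the transaction
# cannot survive pre-validation as-is.

def pre_validate(read_set, committed_write_sets):
    """Return True if no committed writer from the last cycle touched our reads."""
    for write_set in committed_write_sets:
        if read_set & write_set:
            return False  # read-write conflict detected locally
    return True

if __name__ == "__main__":
    mobile_read_set = {"x", "y"}
    last_cycle_writes = [{"z"}, {"y", "w"}]
    print(pre_validate(mobile_read_set, last_cycle_writes))  # False: conflict on "y"
```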
Recovery performance in the event of failures is very important for distributed real-time database systems. This paper presents a time-cognizant logging-based crash recovery scheme (TCLCRS) aimed at distributed real-time databases that adopt a main-memory database as their ground support. In our scheme, each site maintains a real-time log for the local transactions and the subtransactions that execute at the site, and performs local checkpointing independently. Log records are stored in non-volatile high-speed storage, which is divided into four partitions based on transaction classes. During restart recovery after a site crash, a partitioned crash recovery strategy is adopted to ensure that the site can be brought up before the entire local secondary database is reloaded into main memory. The partitioned crash recovery strategy not only guarantees that internal consistency is recovered, but also guarantees temporal consistency and recovery of the states of the physical world influenced by uncommitted transactions. Combined with the two-phase commit protocol, TCLCRS can guarantee failure atomicity of distributed real-time transactions.
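A toy sketch of the partitioned-log idea follows, assuming hypothetical transaction-class names and a simple priority-ordered replay; the paper's actual partitioning rule and recovery ordering are more involved.

```python
# Illustrative only: log records routed into four partitions by transaction
# class, so that restart recovery can replay the most urgent partitions first
# and bring the site up before everything is reloaded.
from collections import defaultdict

TRANSACTION_CLASSES = ("hard_deadline", "firm_deadline", "soft_deadline", "non_realtime")

class PartitionedLog:
    def __init__(self):
        self.partitions = defaultdict(list)  # one partition per transaction class

    def append(self, txn_class, record):
        assert txn_class in TRANSACTION_CLASSES
        self.partitions[txn_class].append(record)

    def recover(self):
        """Replay partitions in priority order so the site can come up early."""
        for txn_class in TRANSACTION_CLASSES:
            for record in self.partitions[txn_class]:
                yield txn_class, record

if __name__ == "__main__":
    log = PartitionedLog()
    log.append("soft_deadline", {"txn": 7, "op": "write", "item": "x", "value": 42})
    log.append("hard_deadline", {"txn": 3, "op": "write", "item": "y", "value": 1})
    for entry in log.recover():
        print(entry)
```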
Most of the proposed concurrency control protocols for real-time database systems are based on the serializability theorem. Owing to the unique characteristics of real-time database applications and the importance of satisfying the timing constraints of transactions, serializability is too strong a correctness criterion and is not suitable for real-time databases in most cases. On the other hand, relaxed serializability, including epsilon serializability and similarity serializability, can allow more real-time transactions to satisfy their timing constraints, but database consistency may be sacrificed to some extent. We thus propose the use of weak serializability (WSR), which is more relaxed than conflict serializability while database consistency is maintained. In this paper, we first formally define the new notion of correctness called weak serializability. After the necessary and sufficient conditions for weak serializability are shown, the corresponding concurrency control protocol WDHP (weak serializable distributed high priority protocol) is outlined for distributed real-time databases, where a new lock mode called the mask lock mode is proposed to simplify the condition for global consistency. Finally, through a series of simulation studies, it is shown that the new concurrency control protocol can greatly improve the performance of distributed real-time databases.
Since the early 1990s, significant progress in database technology has provided a new platform for emerging dimensions of data engineering. New models were introduced to utilize the data sets stored in the new generations of databases. These models have had a deep impact on evolving decision-support systems, but they suffer from a variety of practical problems when accessing real-world data sources. Specifically, a type of data storage model based on data distribution theory has been increasingly used in recent years by large-scale enterprises, yet it is not compatible with existing decision-support models. This data storage model stores the data in the different geographical sites where they are most regularly accessed. This leads to considerably less inter-site data transfer, which can reduce data security issues in some circumstances and also significantly improve the speed of data manipulation transactions. The aim of this paper is to propose a new approach for supporting proactive decision-making that utilizes a workable data source management methodology. The new model can effectively organize and use complex data sources, even when they are distributed across different sites in fragmented form. At the same time, the new model provides a very high level of management decision support by intelligent use of the data collections through new smart methods for synthesizing useful knowledge. The results of an empirical study to evaluate the model are provided.
The efficiency and performance of a Distributed Database Management System (DDBMS) are mainly determined by its proper design and by the network communication cost between sites. Fragmentation and distribution of data are the major design issues of a DDBMS. In this paper, we propose a new approach that integrates both fragmentation and data allocation in one strategy based on a high-performance clustering technique and transaction processing cost functions. This approach efficiently and effectively achieves the objectives of data fragmentation, data allocation, and network site clustering. The approach splits the data relations into pair-wise disjoint fragments and determines whether each fragment should be allocated to a network site, allocating it only where the allocation benefit outweighs the cost according to the high-performance clustering technique. To show the performance of the proposed approach, we performed experimental studies on a real database application under different network connectivity conditions. The results show that the approach achieves minimum total data transaction costs between sites, reduces the amount of redundant data accessed between these sites, and improves overall DDBMS performance.
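The allocate-only-where-benefit-outweighs-cost rule can be sketched as follows; the cost model here (per-access transfer saving versus a flat storage cost) is an illustrative stand-in for the paper's transaction processing cost functions.

```python
# Hedged sketch: for each fragment and candidate site, estimate the local-access
# benefit versus the cost of storing a copy, and allocate only where the benefit
# wins. A toy cost model, not the paper's actual functions.

def allocation_plan(fragments, sites, access_freq, transfer_cost, storage_cost):
    """access_freq[(site, frag)] = accesses per period; returns frag -> [sites]."""
    plan = {}
    for frag in fragments:
        chosen = []
        for site in sites:
            local_benefit = access_freq.get((site, frag), 0) * transfer_cost
            if local_benefit > storage_cost:
                chosen.append(site)
        # Fall back to the single most frequent accessor if no site qualifies.
        plan[frag] = chosen or [max(sites, key=lambda s: access_freq.get((s, frag), 0))]
    return plan

if __name__ == "__main__":
    freq = {("S1", "F1"): 50, ("S2", "F1"): 2, ("S2", "F2"): 30}
    print(allocation_plan(["F1", "F2"], ["S1", "S2"], freq,
                          transfer_cost=1.0, storage_cost=10.0))
```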
This paper formally defines and analyses the new notion of correctness called quasi serializability, and then outlines the corresponding concurrency control protocol QDHP for distributed real-time databases. Finally, through a series of simulation studies, it shows that the new concurrency control protocol can greatly improve the performance of distributed real-time databases.
The traditional method first classifies the user information and combines it with a query method to retrieve interest information, but it neglects to calculate the weight of the user interest information, which leads to low retrieval accuracy. A retrieval method based on fuzzy proximity classification technology is therefore proposed. Approximation between fuzzy sets is used to represent the consistency between user interest information features, and the consistency calculation formula and the skewness confidence matrix between the user interest information features are given. Fuzzy classification of the user interest information can obtain the most consistent confidence data and eliminate redundant approximate interference data. A probabilistic model of word frequency and user interest information length is used to calculate the weight of the user interest information, and the weight formula is adjusted continuously.
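Purely as an illustration of computing a weight from word frequency and interest-information length, the snippet below uses a length-normalised, smoothed term-frequency score; the formula is an assumption and not the paper's actual probabilistic model.

```python
# Illustrative only: one plausible reading of "weight from word frequency and
# information length" as a smoothed, length-normalised term-frequency score.
from collections import Counter

def interest_weights(interest_text, smoothing=0.5):
    terms = interest_text.lower().split()
    counts = Counter(terms)
    length = len(terms)
    return {t: (c + smoothing) / (length + smoothing * len(counts))
            for t, c in counts.items()}

if __name__ == "__main__":
    print(interest_weights("distributed database query query optimization"))
```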
For a transaction processing system to operate effectively and efficiently in cloud environments, it is important to distribute huge amounts of data while guaranteeing the ACID (atomic, consistent, isolated, and durable) properties. Moreover, database partition and migration tools can help transplant conventional relational database systems to the cloud environment rather than rebuilding a new system. This paper proposes a database distribution management (DBDM) system, which partitions or replicates the data according to the transaction behaviors of the application system. The principal strategy of DBDM is to keep together the data used in a single transaction, thus avoiding massive transmission of records in join operations. The proposed system has been implemented successfully. Preliminary experiments show that DBDM performs database partition and migration effectively. The DBDM system is also modularly designed so that it can adapt to different database management systems (DBMS) or different partition algorithms.
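The principal strategy of keeping together the data used in a single transaction can be sketched by grouping records that co-occur in transaction traces; the union-find grouping below is an illustrative stand-in for DBDM's actual partition algorithm.

```python
# Hedged sketch: records that appear together in the same transaction are merged
# into the same group, so a partitioner can place each group on one node and
# avoid cross-node joins. Illustrative names and structures only.

class DisjointSet:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def partition_by_transactions(transactions):
    """transactions: list of sets of record ids accessed together."""
    ds = DisjointSet()
    for txn in transactions:
        items = list(txn)
        for other in items[1:]:
            ds.union(items[0], other)
    groups = {}
    for txn in transactions:
        for item in txn:
            groups.setdefault(ds.find(item), set()).add(item)
    return list(groups.values())

if __name__ == "__main__":
    traces = [{"order:1", "customer:9"}, {"order:2", "customer:9"}, {"product:5"}]
    print(partition_by_transactions(traces))
```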
As typical peer-to-peer distributed networks, blockchain systems require each node to keep a complete copy of the transaction database so that new transactions can be verified independently. In a blockchain system (e.g., the Bitcoin system), a node does not rely on any central organization, and every node keeps an entire copy of the transaction database. However, this feature means that the blockchain transaction database grows rapidly. Therefore, as the system continues to operate, node memory also needs to be expanded to keep the system running. Especially in the big data era, increasing network traffic will lead to a faster transaction growth rate. This paper analyzes blockchain transaction databases and proposes a storage optimization scheme. The proposed scheme divides the blockchain transaction database into a cold zone and a hot zone using an expiration recognition method based on the Least Recently Used (LRU) algorithm. It achieves storage optimization by moving unspent transaction outputs outside the in-memory transaction databases. We present a theoretical analysis of the optimization method to validate its effectiveness. Extensive experiments show that our proposed method outperforms the current mechanism for blockchain transaction databases.
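A minimal sketch of the hot/cold split, assuming an in-memory OrderedDict as the hot zone and a plain dict standing in for the on-disk cold zone; capacities and structures are illustrative, not the paper's implementation.

```python
# Sketch: unspent transaction outputs (UTXOs) not referenced recently are
# recognised as "expired" by an LRU rule and moved from the in-memory hot zone
# to the cold zone; a later access promotes them back.
from collections import OrderedDict

class UTXOStore:
    def __init__(self, hot_capacity):
        self.hot = OrderedDict()   # in-memory zone, ordered by recency of use
        self.cold = {}             # stand-in for the on-disk cold zone
        self.hot_capacity = hot_capacity

    def access(self, utxo_id, value=None):
        if utxo_id in self.hot:
            self.hot.move_to_end(utxo_id)                # refresh recency
        elif utxo_id in self.cold:
            self.hot[utxo_id] = self.cold.pop(utxo_id)   # promote back to hot
        else:
            self.hot[utxo_id] = value
        self._evict_expired()
        return self.hot[utxo_id]

    def _evict_expired(self):
        while len(self.hot) > self.hot_capacity:
            old_id, old_value = self.hot.popitem(last=False)  # least recently used
            self.cold[old_id] = old_value

if __name__ == "__main__":
    store = UTXOStore(hot_capacity=2)
    for i in range(4):
        store.access(f"utxo-{i}", value=i)
    print(list(store.hot), list(store.cold))
```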
In order to design a new kind of mobile database management system (DBMS) more suitable for mobile computing than existing DBMSs, the essence of database systems in mobile computing is analyzed. The view is introduced that a mobile database is a kind of dynamic distributed database, and the concept of virtual servers, which translate the clients' mobility into the servers' mobility, is proposed. Based on these ideas, a versatile architecture for a mobile DBMS is presented. The architecture is composed of a virtual server and a local DBMS; the virtual server is the kernel of the architecture, and its functions are described. Finally, the server kernel of a mobile DBMS prototype is illustrated.
This paper presents a case study on the structural design and establishment of a database application system for alien species in Shandong Province, integrating Geographic Information System (GIS), computer network, and database technologies into the research on alien species. The modules of the alien species database, including classified data input, statistics and analysis, species pictures and distribution maps, and data output, were developed with Visual Studio .NET 2003 and Microsoft SQL Server 2000. The alien species information covers classification, species distinguishing characteristics, biological characteristics, original area, distribution area, mode and route of entry, invasion time, invasion reason, interaction with endemic species, growth state, danger state, and spatial information, i.e., distribution maps. On this basis, several modules, including application, checking, modifying, printing, adding, and returning modules, were developed. Furthermore, through the establishment of index tables and index maps, data such as pictures, text, and GIS maps can also be queried spatially. This research established a technological platform for sharing information on the scientific resources of alien species in Shandong Province, providing a basis for dynamic queries on alien species, early-warning technology for prevention, and a fast reaction system. The database application system is practical, has a friendly user interface, and is convenient to use. It supplies complete and accurate information query services on alien species for users and provides the administrator with functions for dynamically managing the database.
To make business policy, perform market analysis, support corporate decisions, detect fraud, etc., we have to analyze and work with huge amounts of data. Generally, such data are taken from different sources. Researchers use data mining to perform such tasks. Data mining techniques are used to find hidden information in large data sources. Data mining is applied in various fields: artificial intelligence, banking, health and medicine, corruption, legal issues, corporate business, marketing, etc. Special interest is given to association rules, data mining algorithms, decision trees, and distributed approaches. Data are becoming larger and are spreading geographically, so it is difficult to obtain good results from a single central data source. For knowledge discovery, we have to work with distributed databases. On the other hand, security and privacy considerations are another factor discouraging work with centralized data. For these reasons, distributed databases are essential for future processing. In this paper, we propose a framework for studying data mining in a distributed environment. The paper presents a framework for bringing out actionable knowledge. We identify several levels at which actionable knowledge can be generated, and possible tools and techniques for these levels are discussed.
Enhanced understanding of how sampling techniques affect estimates of the global U-Pb age-distribution has, in turn, constrained U-Pb database design. Recent studies indicate that each continent has a unique age-distribution, as determined by zircon ages dated by the U-Pb isotope method. Likewise, broad regions within a continent also exhibit diverse age-distributions. To achieve a reliable estimate of the global distribution, the heterogeneous composition of the continental crust requires sampling as many regions as feasibly possible. To attain this goal, and to provide a method for calculating age histograms, the records from a recent global U-Pb compilation are supplemented with 281,631 new records. These additions increase the database size to 700,598 records. In addition, the data are now restructured and made available as a relational database. After filtering the records by the six age-models included with the database, the results reveal two problems that might generally go unrecognized. First, an abrupt switch in the best-age at any given point (such as 1000 Ma) from ^(206)Pb/^(238)U ages to ^(207)Pb/^(206)Pb ages artificially depresses the age-distribution at the cutoff point. Second, rejecting analyses based on either absolute discordance or the magnitude of 2σ precision errors artificially depresses the age-distribution between 900 Ma and 2000 Ma. The results indicate that, when estimating the global U-Pb age-distribution, the methods for determining best-age and for rejecting records both require some attention. Possible solutions include using either an Accuracy Model or a Precision Model for estimating best-age, and then including all U-Pb records in the estimate rather than rejecting any of them.
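The two best-age strategies at issue can be contrasted in a few lines; the field names below are assumptions about the database layout, and the "Precision Model" rule is reduced to simply taking the age with the smaller reported uncertainty.

```python
# Hedged sketch: the hard-cutoff rule criticised above versus a precision-based
# choice of best-age. Field names are assumed, not the database's actual schema.

def best_age_cutoff(record, cutoff_ma=1000.0):
    """Naive rule: 206Pb/238U age below the cutoff, 207Pb/206Pb age above it."""
    return record["age_206_238"] if record["age_206_238"] < cutoff_ma else record["age_207_206"]

def best_age_precision(record):
    """Precision-style rule: take the age with the smaller 2-sigma error."""
    if record["err_206_238"] <= record["err_207_206"]:
        return record["age_206_238"]
    return record["age_207_206"]

if __name__ == "__main__":
    zircon = {"age_206_238": 1050.0, "err_206_238": 12.0,
              "age_207_206": 1080.0, "err_207_206": 25.0}
    print(best_age_cutoff(zircon), best_age_precision(zircon))
```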
Dynamic programming (DP) is an effective query optimization approach for selecting an appropriate join order for a relational database management system (RDBMS) in multi-table joins. This method was extended and made available in distributed DBMSs (D-DBMS). The structure of the optimal solution is first characterized according to the distribution status of tables and data, and then the recurrence relations between a problem and its sub-problems are defined recursively. DP in a D-DBMS has the same time complexity as in a centralized DBMS, while it has the capability to solve a much more sophisticated multi-table join optimization problem in the D-DBMS. The effectiveness of this optimization strategy has been proved by experiments.
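A compact sketch of dynamic-programming join-order enumeration, with the distributed aspect reduced to a single hypothetical transfer penalty whenever the two sub-plans sit on different sites; the actual D-DBMS cost model in the paper is richer.

```python
# Sketch: for every subset of tables, keep the cheapest plan found by splitting
# the subset into two previously solved sub-plans (classic DP join ordering).
# The cost model and site penalty are illustrative assumptions.
from itertools import combinations

def dp_join_order(tables, card, transfer_penalty=1.0):
    """tables: dict name -> site; card: dict frozenset(names) -> cardinality."""
    best = {frozenset([t]): (card[frozenset([t])], (t,)) for t in tables}
    names = list(tables)
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            subset = frozenset(combo)
            for k in range(1, size):
                for left in combinations(combo, k):
                    left_set, right_set = frozenset(left), subset - frozenset(left)
                    if left_set not in best or right_set not in best:
                        continue
                    cost = best[left_set][0] + best[right_set][0] + card.get(subset, 1)
                    # Crude stand-in for shipping cost between different sites.
                    if {tables[t] for t in left_set} != {tables[t] for t in right_set}:
                        cost += transfer_penalty * card.get(subset, 1)
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, (best[left_set][1], best[right_set][1]))
    return best[frozenset(names)]

if __name__ == "__main__":
    tables = {"R": "site1", "S": "site1", "T": "site2"}
    card = {frozenset(["R"]): 100, frozenset(["S"]): 50, frozenset(["T"]): 200,
            frozenset(["R", "S"]): 80, frozenset(["S", "T"]): 60,
            frozenset(["R", "T"]): 500, frozenset(["R", "S", "T"]): 40}
    print(dp_join_order(tables, card))
```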
Electronic patient data gives many advantages, but also new difficulties. Deadlocks may delay procedures like acquiring patient information. Distributed deadlock resolution solutions introduce uncertainty due to inaccurate transaction properties. Soft computing-based solutions have been developed to solve this challenge. In a single framework, ambiguous, vague, incomplete, and inconsistent transaction attribute information has received minimal attention. The work presented in this paper employed type-2 neutrosophic logic, an extension of type-1 neutrosophic logic, to handle uncertainty in real-time deadlock-resolving systems. The proposed method is structured to reflect multiple types of knowledge and relations among transactions' features that include validation factor degree, slackness degree, and degree of deadline-missed transaction, based on the degree of membership of truthiness, degree of membership of indeterminacy, and degree of membership of falsity. Here, the footprint of uncertainty (FOU) for truth, indeterminacy, and falsity represents the level of uncertainty that exists in the value of a grade of membership. We employed a distributed real-time transaction processing simulator (DRTTPS) to conduct the simulations and conducted experiments using the benchmark Pima Indians diabetes dataset (PIDD). As the results showed, there is an increase in detection rate and a large drop in rollback rate when this new strategy is used. The performance of type-2 neutrosophic-based resolution is better than the type-1 neutrosophic-based approach on the execution ratio scale. The improvement rate has reached 10% to 20%, depending on the number of arrived transactions.
The book chapter is an extended version of the research paper entitled "Use of Component Integration Services in Multidatabase Systems", which was presented and published at the 13th ISITA, the National Conference on Recent Trends in Mathematical and Computer Sciences, T.M.B. University, Bhagalpur, India, January 3-4, 2015. Information is widely distributed across many remote, distributed, and autonomous databases (local component databases) in heterogeneous formats. The integration of heterogeneous remote databases is a difficult task, and it has already been addressed by several projects to a certain extent. In this chapter, we discuss how to integrate heterogeneous distributed local relational databases because of their simplicity, excellent security, performance, power, flexibility, data independence, support for new hardware technologies, and spread across the globe. We also discuss how to constitute a global conceptual schema in a multidatabase system using Sybase Adaptive Server Enterprise's Component Integration Services (CIS) and OmniConnect. This is feasible for higher education institutions and commercial industries as well. For higher educational institutions, CIS will improve IT integration with their subsidiaries or with other institutions within the country and abroad in terms of educational management, teaching, learning, and research, including promoting international students' academic integration, collaboration, and governance. This will prove to be an innovative strategy to support the modernization and large-scale expansion of academic institutions, and can be considered IT-institutional alignment within a higher education context. It will also support achieving one of the sustainable development goals set by the United Nations: "Goal 4: ensure inclusive and quality education for all and promote lifelong learning". However, the process of IT integration into higher educational institutions must be thoroughly evaluated, identifying the vital data access points. In this chapter, Section 1 provides an introduction, including the evolution of various database systems, data models, and the emergence of multidatabase systems and their importance. Section 2 discusses Component Integration Services (CIS) and OmniConnect, considering heterogeneous relational distributed local databases from an academic perspective. Section 3 discusses the Sybase Adaptive Server Enterprise (ASE). Section 4 discusses the role of Component Integration Services and OmniConnect of Sybase ASE in the multidatabase system. Section 5 shows the database architectural framework. Section 6 provides an implementation overview of the global conceptual schema in the multidatabase system. Section 7 discusses query processing in the CIS, and finally, Section 8 concludes the chapter. The chapter will be helpful to students, as it discusses in detail the evolution of databases and data models and the emergence of multidatabases. Where additional useful information is cited, the source of each citation is properly listed in the references.
The periphery of the Qinghai-Tibet Plateau is renowned for its susceptibility to landslides. However, the northwestern margin of this region, characterised by limited human activity and challenging transportation, remains insufficiently explored with respect to landslide occurrence and distribution. With the planning and construction of the Xinjiang-Tibet Railway, a comprehensive investigation of disastrous landslides in this area is essential for effective disaster preparedness and mitigation strategies. Using a human-computer interaction interpretation approach, the authors established a landslide database encompassing 13003 landslides, collectively spanning an area of 3351.24 km^(2) (36°N-40°N, 73°E-78°E). The database incorporates diverse topographical and environmental parameters, including regional elevation, slope angle, slope aspect, distance to faults, distance to roads, distance to rivers, annual precipitation, and stratum. The statistical characteristics of the number and area of landslides, landslide number density (LND), and landslide area percentage (LAP) are analyzed. The authors found a predominant concentration of landslide origins in regions with high slope angles, with the highest incidence observed in intervals characterised by average slopes of 20° to 30°, maximum slope angles above 80°, and orientations towards the north (N), northeast (NE), and southwest (SW). Additionally, elevations above 4.5 km, distances to rivers below 1 km, and rainfall between 20-30 mm and 30-40 mm emerge as conditions particularly susceptible to landslide development. The study area's geological composition primarily comprises Mesozoic and Upper Paleozoic outcrops. Both faults and human engineering activities influence landslide development to different degrees. Furthermore, the significance of the landslide database, the relationship between landslide distribution and environmental factors, and the geometric and morphological characteristics of landslides are discussed. The landslide H/L ratios in the study area are mainly concentrated between 0.4 and 0.64. This indicates that landslide mobility in the region is relatively low, and the authors speculate that landslides in this region were more likely triggered by earthquakes or are located in the meizoseismal area.
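The H/L mobility index mentioned above is simply the elevation drop from crown to toe divided by the horizontal runout length, with larger ratios indicating lower mobility; the values below are made-up examples, not database records.

```python
# Simple illustration of the H/L mobility index used above.

def h_over_l(crown_elev_m, toe_elev_m, runout_length_m):
    """Elevation drop divided by horizontal runout length (travel-angle tangent)."""
    return (crown_elev_m - toe_elev_m) / runout_length_m

if __name__ == "__main__":
    samples = [(4850.0, 4490.0, 700.0), (5200.0, 4700.0, 900.0)]
    ratios = [h_over_l(*s) for s in samples]
    print([round(r, 2) for r in ratios])  # e.g. [0.51, 0.56], within the 0.4-0.64 band
```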