期刊文献+
共找到22篇文章
< 1 2 >
每页显示 20 50 100
Scheduling transactions in mobile distributed real-time database systems 被引量:1
1
作者 雷向东 赵跃龙 +1 位作者 陈松乔 袁晓莉 《Journal of Central South University of Technology》 EI 2008年第4期545-551,共7页
A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transaction in mobile broadcast environment... A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transaction in mobile broadcast environments. At the mobile hosts, all transactions perform local pre-validation. The local pre-validation process is carried out against the committed transactions at the server in the last broadcast cycle. Transactions that survive in local pre-validation must be submitted to the server for local final validation. The new protocol eliminates conflicts between mobile read-only and mobile update transactions, and resolves data conflicts flexibly by using multiversion dynamic adjustment of serialization order to avoid unnecessary restarts of transactions. Mobile read-only transactions can be committed with no-blocking, and respond time of mobile read-only transactions is greatly shortened. The tolerance of mobile transactions of disconnections from the broadcast channel is increased. In global validation mobile distributed transactions have to do check to ensure distributed serializability in all participants. The simulation results show that the new concurrency control protocol proposed offers better performance than other protocols in terms of miss rate, restart rate, commit rate. Under high work load (think time is ls) the miss rate of DMVOCC-MVDA is only 14.6%, is significantly lower than that of other protocols. The restart rate of DMVOCC-MVDA is only 32.3%, showing that DMVOCC-MVDA can effectively reduce the restart rate of mobile transactions. And the commit rate of DMVOCC-MVDA is up to 61.2%, which is obviously higher than that of other protocols. 展开更多
关键词 mobile distributed real-time database systems muliversion optimistic concurrency control multiversion dynamic adjustment pre-validation multiversion data broadcast
下载PDF
Use of Component Integration Services in Multidatabase Systems: A Feasible Solution for Integrating Academic Institutions or Commercial Industries
2
作者 Mohammad Ghulam Ali Mohammad Ghulam Murtuza 《Journal of Software Engineering and Applications》 2023年第11期561-585,共25页
The book chapter is an extended version of the research paper entitled “Use of Component Integration Services in Multidatabase Systems”, which is presented and published by the 13<sup>th</sup> ISITA, the... The book chapter is an extended version of the research paper entitled “Use of Component Integration Services in Multidatabase Systems”, which is presented and published by the 13<sup>th</sup> ISITA, the National Conference of Recent Trends in Mathematical and Computer Sciences, T.M.B. University, Bhagalpur, India, January 3-4, 2015. Information is widely distributed across many remote, distributed, and autonomous databases (local component databases) in heterogeneous formats. The integration of heterogeneous remote databases is a difficult task, and it has already been addressed by several projects to certain extents. In this chapter, we have discussed how to integrate heterogeneous distributed local relational databases because of their simplicity, excellent security, performance, power, flexibility, data independence, support for new hardware technologies, and spread across the globe. We have also discussed how to constitute a global conceptual schema in the multidatabase system using Sybase Adaptive Server Enterprise’s Component Integration Services (CIS) and OmniConnect. This is feasible for higher education institutions and commercial industries as well. Considering the higher educational institutions, the CIS will improve IT integration for educational institutions with their subsidiaries or with other institutions within the country and abroad in terms of educational management, teaching, learning, and research, including promoting international students’ academic integration, collaboration, and governance. This will prove an innovative strategy to support the modernization and large expansion of academic institutions. This will be considered IT-institutional alignment within a higher education context. This will also support achieving one of the sustainable development goals set by the United Nations: “Goal 4: ensure inclusive and quality education for all and promote lifelong learning”. However, the process of IT integration into higher educational institutions must be thoroughly evaluated, identifying the vital data access points. In this chapter, Section 1 provides an introduction, including the evolution of various database systems, data models, and the emergence of multidatabase systems and their importance. Section 2 discusses component integration services (CIS), OmniConnect and considering heterogeneous relational distributed local databases from the perspective of academics, Section 3 discusses the Sybase Adaptive Server Enterprise (ASE), Section 4 discusses the role of component integration services and OmniConnect of Sybase ASE under the Multidatabase System, Section 5 shows the database architectural framework, Section 6 provides an implementation overview of the global conceptual schema in the multidatabase system, Section 7 discusses query processing in the CIS, and finally, Section 8 concludes the chapter. The chapter will help our students a lot, as we have discussed well the evolution of databases and data models and the emergence of multidatabases. Since some additional useful information is cited, the source of information for each citation is properly mentioned in the references column. 展开更多
关键词 Relational database Component Integration Services OmniConnect MULTIdatabase Global Conceptual Schema distributed database Local Conceptual Schema database Integration IT Integration Higher Education Commercial Industries HEIs
下载PDF
A dynamic crash recovery scheme for distributed real-time database systems
3
作者 肖迎元 刘云生 +2 位作者 刘小峰 廖国琼 王洪亚 《Journal of Shanghai University(English Edition)》 CAS 2006年第6期510-516,共7页
Recovery performance in the event of failures is very important for distributed real-time database systems. This paper presents a time-cognizant logging-based crash recovery scheme (TCLCRS) that aims at distributed ... Recovery performance in the event of failures is very important for distributed real-time database systems. This paper presents a time-cognizant logging-based crash recovery scheme (TCLCRS) that aims at distributed real-time databases, which adopts a main memory database as its ground support. In our scheme, each site maintains a real-time log for local transactions and the subtransactions, which execute at the site, and execte local checkpointing independently. Log records are stored in non-volatile high- speed store, which is divided into four different partitions based on transaction classes. During restart recovery after a site crash, partitioned crash recovery strategy is adopted to ensure that the site can be brought up before the entire local secondary database is reloaded in main memory. The partitioned crash recovery strategy not only guarantees the internal consistency to be recovered, but also guarantee the temporal consistency and recovery of the sates of physical world influenced by uncommitted transactions. Combined with two- phase commit protocol, TCLCRS can guarantee failure atomicity of distributed real-time transactions. 展开更多
关键词 distributed real-time database system partitioned real-time logging partitioned crash recovery.
下载PDF
Weak Serializable Concurrency Control in Distributed Real-Time Database Systems
4
作者 党德鹏 刘云生 潘琳 《Journal of Shanghai University(English Edition)》 CAS 2002年第4期325-330,共6页
Most of the proposed concurrency control protocols for real time database systems are based on serializability theorem. Owing to the unique characteristics of real time database applications and the importance of sa... Most of the proposed concurrency control protocols for real time database systems are based on serializability theorem. Owing to the unique characteristics of real time database applications and the importance of satisfying the timing constraints of transactions, serializability is too strong as a correctness criterion and not suitable for real time databases in most cases. On the other hand, relaxed serializability including epsilon serializability and similarity serializability can allow more real time transactions to satisfy their timing constraints, but database consistency may be sacrificed to some extent. We thus propose the use of weak serializability(WSR) that is more relaxed than conflicting serializability while database consistency is maintained. In this paper, we first formally define the new notion of correctness called weak serializability. After the necessary and sufficient conditions for weak serializability are shown, corresponding concurrency control protocol WDHP(weak serializable distributed high priority protocol) is outlined for distributed real time databases, where a new lock mode called mask lock mode is proposed for simplifying the condition of global consistency. Finally, through a series of simulation studies, it is shown that using the new concurrency control protocol the performance of distributed real time databases can be greatly improved. 展开更多
关键词 distributed real time database systems relaxed serializability real time concurrency control read only transactions.
下载PDF
Quasi Serializable Concurrency Control in Distributed Real-Time Database Systems
5
作者 党德鹏 Liu Yunsheng 《High Technology Letters》 EI CAS 2003年第1期72-76,共5页
This paper formally defines and analyses the new notion of correctness called quasi serializability, and then outlines corresponding concurrency control protocol QDHP for distributed real-time databases. Finally, thro... This paper formally defines and analyses the new notion of correctness called quasi serializability, and then outlines corresponding concurrency control protocol QDHP for distributed real-time databases. Finally, through a series of simulation studies, it shows that using the new concurrency control protocol the performance of distributed real-time databases can be much improved. 展开更多
关键词 distributed real-time database systems relaxing serializability real-tirne concurrency control
下载PDF
DCAD:a Dual Clustering Algorithm for Distributed Spatial Databases 被引量:15
6
作者 ZHOU Jiaogen GUAN Jihong LI Pingxiang 《Geo-Spatial Information Science》 2007年第2期137-144,共8页
Spatial objects have two types of attributes: geometrical attributes and non-geometrical attributes, which belong to two different attribute domains (geometrical and non-geometrical domains). Although geometrically... Spatial objects have two types of attributes: geometrical attributes and non-geometrical attributes, which belong to two different attribute domains (geometrical and non-geometrical domains). Although geometrically scattered in a geometrical domain, spatial objects may be similar to each other in a non-geometrical domain. Most existing clustering algorithms group spatial datasets into different compact regions in a geometrical domain without considering the aspect of a non-geometrical domain. However, many application scenarios require clustering results in which a cluster has not only high proximity in a geometrical domain, but also high similarity in a non-geometrical domain. This means constraints are imposed on the clustering goal from both geometrical and non-geometrical domains simultaneously. Such a clustering problem is called dual clustering. As distributed clustering applications become more and more popular, it is necessary to tackle the dual clustering problem in distributed databases. The DCAD algorithm is proposed to solve this problem. DCAD consists of two levels of clustering: local clustering and global clustering. First, clustering is conducted at each local site with a local clustering algorithm, and the features of local clusters are extracted clustering is obtained based on those features fective and efficient. Second, local features from each site are sent to a central site where global Experiments on both artificial and real spatial datasets show that DCAD is effective and efficient. 展开更多
关键词 distributed clustering dual clustering distributed spatial database
下载PDF
An architecture for mobile database management system 被引量:2
7
作者 Dong Li and Yucai Feng Computer School, Huazhong University of Science and Technology, Wuhan 430074, China 《Journal of University of Science and Technology Beijing》 CSCD 2002年第2期156-160,共5页
In order to design a new kind of mobile database management system (DBMS)more suitable for mobile computing than the existent DBMS, the essence of database systems in mobilecomputing is analyzed. An opinion is introdu... In order to design a new kind of mobile database management system (DBMS)more suitable for mobile computing than the existent DBMS, the essence of database systems in mobilecomputing is analyzed. An opinion is introduced that the mobile database is a kind of dynamicdistributed database, and the concept of virtual servers to translate the clients' mobility to theservers' mobility is proposed. Based on these opinions, a kind of architecture of mobile DBMS, whichis of versatility, is presented. The architecture is composed of a virtual server and a local DBMS,the virtual server is the kernel of the architecture and its functions are described. Eventually,the server kernel of a mobile DBMS prototype is illustrated. 展开更多
关键词 mobile database dynamic distributed database DBMS ARCHITECTURE virtual server data region
下载PDF
A Distribution Management System for Relational Databases in Cloud Environments
8
作者 Sze-Yao Li Chun-Ming Chang +3 位作者 Yuan-Yu Tsai Seth Chen Jonathan Tsai Wen-Lung Tsai 《Journal of Electronic Science and Technology》 CAS 2013年第2期169-175,共7页
For a transaction processing system to operate effectively and efficiently in cloud environments, it is important to distribute huge amount of data while guaranteeing the ACID (atomic, consistent, isolated, and dura... For a transaction processing system to operate effectively and efficiently in cloud environments, it is important to distribute huge amount of data while guaranteeing the ACID (atomic, consistent, isolated, and durable) properties. Moreover, database partition and migration tools can help transplanting conventional relational database systems to the cloud environment rather than rebuilding a new system. This paper proposes a database distribution management (DBDM) system, which partitions or replicates the data according to the transaction behaviors of the application system. The principle strategy of DBDM is to keep together the data used in a single transaction, and thus, avoiding massive transmission of records in join operations. The proposed system has been implemented successfully. The preliminary experiments show that the DBDM performs the database partition and migration effectively. Also, the DBDM system is modularly designed to adapt to different database management system (DBMS) or different partition algorithms. 展开更多
关键词 Data migration database partition distributed database relational database.
下载PDF
A New Approach for Knowledge Discovery in Distributed Databases Using Fragmented Data Storage Model
9
作者 Masoud Pesaran Behbahani Islam Choudhury Souheil Khaddaj 《Chinese Business Review》 2013年第12期834-845,共12页
Since the early 1990, significant progress in database technology has provided new platform for emerging new dimensions of data engineering. New models were introduced to utilize the data sets stored in the new genera... Since the early 1990, significant progress in database technology has provided new platform for emerging new dimensions of data engineering. New models were introduced to utilize the data sets stored in the new generations of databases. These models have a deep impact on evolving decision-support systems. But they suffer a variety of practical problems while accessing real-world data sources. Specifically a type of data storage model based on data distribution theory has been increasingly used in recent years by large-scale enterprises, while it is not compatible with existing decision-support models. This data storage model stores the data in different geographical sites where they are more regularly accessed. This leads to considerably less inter-site data transfer that can reduce data security issues in some circumstances and also significantly improve data manipulation transactions speed. The aim of this paper is to propose a new approach for supporting proactive decision-making that utilizes a workable data source management methodology. The new model can effectively organize and use complex data sources, even when they are distributed in different sites in a fragmented form. At the same time, the new model provides a very high level of intellectual management decision-support by intelligent use of the data collections through utilizing new smart methods in synthesizing useful knowledge. The results of an empirical study to evaluate the model are provided. 展开更多
关键词 data mining decision-support system distributed databases knowledge discovery in database (KDD)
下载PDF
A Distributed DBMS Based Dynamic Programming Method for Query Optimization
10
作者 孙纪舟 李阳 +2 位作者 蒋志勇 顾云苏 何清法 《Journal of Donghua University(English Edition)》 EI CAS 2012年第1期55-58,共4页
Dynamic programming(DP) is an effective query optimization approach to select an appropriate join order for relational database management system(RDBMS) in multi-table joins. This method was extended and made availabl... Dynamic programming(DP) is an effective query optimization approach to select an appropriate join order for relational database management system(RDBMS) in multi-table joins. This method was extended and made available in distributed DBMS(D-DBMS). The structure of this optimal solution was firstly characterized according to the distributing status of tables and data, and then the recurrence relations between a problem and its sub-problems were recursively defined. DP in D-DBMS has the same time-complexity with that in centralized DBMS, while it has the capability to solve a much more sophisticated optimal problem of multi-table join in D-DBMS. The effectiveness of this optimal strategy has been proved by experiments. 展开更多
关键词 distributed database dynamic programming (DP) multitable loin: auery optimization
下载PDF
Designing a Model to Study Data Mining in Distributed Environment
11
作者 Md. Abadur Rahman Masud Karim 《Journal of Data Analysis and Information Processing》 2021年第1期23-29,共7页
To make business policy, market analysis, corporate decision, fraud detection, etc., we have to analyze and work with huge amount of data. Generally, such data are taken from different sources. Researchers are using d... To make business policy, market analysis, corporate decision, fraud detection, etc., we have to analyze and work with huge amount of data. Generally, such data are taken from different sources. Researchers are using data mining to perform such tasks. Data mining techniques are used to find hidden information from large data source. Data mining is using for various fields: Artificial intelligence, Bank, health and medical, corruption, legal issues, corporate business, marketing, etc. Special interest is given to associate rules, data mining algorithms, decision tree and distributed approach. Data is becoming larger and spreading geographically. So it is difficult to find better result from only a central data source. For knowledge discovery, we have to work with distributed database. On the other hand, security and privacy considerations are also another factor for de-motivation of working with centralized data. For this reason, distributed database is essential for future processing. In this paper, we have proposed a framework to study data mining in distributed environment. The paper presents a framework to bring out actionable knowledge. We have shown some level by which we can generate actionable knowledge. Possible tools and technique for these levels are discussed. 展开更多
关键词 Data Mining distributed database Knowledge Discovery Classification Algorithm
下载PDF
Scalable and adaptive log manager in distributed systems
12
作者 Huan ZHOU Weining QIAN +3 位作者 Xuan ZHOU Qiwen DONG Aoying ZHOU Wenrong TAN 《Frontiers of Computer Science》 SCIE EI CSCD 2023年第2期45-62,共18页
On-line transaction processing(OLTP)systems rely on transaction logging and quorum-based consensus protocol to guarantee durability,high availability and strong consistency.This makes the log manager a key component o... On-line transaction processing(OLTP)systems rely on transaction logging and quorum-based consensus protocol to guarantee durability,high availability and strong consistency.This makes the log manager a key component of distributed database management systems(DDBMSs).The leader of DDBMSs commonly adopts a centralized logging method to writing log entries into a stable storage device and uses a constant log replication strategy to periodically synchronize its state to followers.With the advent of new hardware and high parallelism of transaction processing,the traditional centralized design of logging limits scalability,and the constant trigger condition of replication can not always maintain optimal performance under dynamic workloads.In this paper,we propose a new log manager named Salmo with scalable logging and adaptive replication for distributed database systems.The scalable logging eliminates centralized contention by utilizing a highly concurrent data structure and speedy log hole tracking.The kernel of adaptive replication is an adaptive log shipping method,which dynamically adjusts the number of log entries transmitted between leader and followers based on the real-time workload.We implemented and evaluated Salmo in the open-sourced transaction processing systems Cedar and DBx1000.Experimental results show that Salmo scales well by increasing the number of working threads,improves peak throughput by 1.56×and reduces latency by more than 4×over log replication of Raft,and maintains efficient and stable performance under dynamic workloads all the time. 展开更多
关键词 distributed database systems transaction log-ging log replication SCALABLE ADAPTIVE
原文传递
Improved Integrity Constraints Checking in Distributed Databases by Exploiting Local Checking 被引量:2
13
作者 Ali A.Alwan Hamidah Ibrahim Nur Izura Udzir 《Journal of Computer Science & Technology》 SCIE EI CSCD 2009年第4期665-674,共10页
Most of the previous studies concerning checking the integrity constraints in distributed database derive simplified forms of the initial integrity constraints with the sufficiency property, since the sufficient test ... Most of the previous studies concerning checking the integrity constraints in distributed database derive simplified forms of the initial integrity constraints with the sufficiency property, since the sufficient test is known to be cheaper than the complete test and its initial integrity constraint as it involves less data to be transferred across the network and can always be evaluated at the target site (single site). Their studies are limited as they depend strictly on the assumption that an update operation will be executed at a site where the relation specified in the update operation is located, which is not always true. Hence, the sufficient test, which is proven to be local test by previous study, is no longer appropriate. This paper proposes an approach to checking integrity constraints in a distributed database by utilizing as much as possible the local information stored at the target site. The proposed approach derives support tests as an alternative to the existing complete and sufficient tests proposed by previous researchers with the intention to increase the number of local checking regardless the location of the submitted update operation. Several analyses have been performed to evaluate the proposed approach, and the results show that support tests can benefit the distributed database, where local constraint checking can be achieved. 展开更多
关键词 distributed database integrity constraint checking integrity constraint integrity test
原文传递
Trustworthiness Technologies of DDSS 被引量:1
14
作者 TIAN Junfeng XIAO Bing +1 位作者 WANG Zixian ZHANG Yuzhu 《Wuhan University Journal of Natural Sciences》 CAS 2006年第6期1853-1856,共4页
The most significant strategic development in information technology over the past years has been "trusted computing" and trusted computers have been produced. In this paper trusted mechanisms adopted by PC is impor... The most significant strategic development in information technology over the past years has been "trusted computing" and trusted computers have been produced. In this paper trusted mechanisms adopted by PC is imported into distributed system, such as chain of trust, trusted root and so on. Based on distributed database server system (DDSS), a novel model of trusted distributed database server system (TDDSS) is presented ultimately. In TDDSS role-based access control, two-level of logs and other technologies are adopted to ensure the trustworthiness of the system. 展开更多
关键词 trusted computing trusted distributed database trusted authorization chain of trust
下载PDF
The Catalog Management Strategy of Distributed Data Base Systems
15
作者 周龙骧 秦箕英 《Journal of Computer Science & Technology》 SCIE EI CSCD 1994年第3期193-203,共11页
In this paper the catalog management strategy of the successfully integrating andrunning DDBMS C-POREL is summarized. The new catalog management strategyand its implementation scheme are based on the analysis of the c... In this paper the catalog management strategy of the successfully integrating andrunning DDBMS C-POREL is summarized. The new catalog management strategyand its implementation scheme are based on the analysis of the catalog managementmethods of the pioneer DDBMS. The goal of the new strategy is to improve the systemefficiency. Analysis and practice show that this strategy is successful. 展开更多
关键词 distributed database management systems (DDBMS) catalog man-agement RECOVERY concurrency control
原文传递
An adaptive strategy for statistics collecting in distributed database
16
作者 Jintao Gao Wenjie Liu Zhanhuai Li 《Frontiers of Computer Science》 SCIE EI CSCD 2020年第5期199-211,共13页
Collecting statistics is a time-and resource-consuming operation in database systems.It is even more challenging to efficiently collect statistics without affecting system performance,meanwhile keeping correctness in ... Collecting statistics is a time-and resource-consuming operation in database systems.It is even more challenging to efficiently collect statistics without affecting system performance,meanwhile keeping correctness in distributed database.Traditional strategies usually consider one dimension during collecting statistics,which is lack of adaptiveness.In this paper,we propose an adaptive strategy for statistics collecting(ASC),which well balances collecting efficiency,correctness of statistics and effect to system performance.We formally define the procedure of collecting statistics and abstract the relationships among collecting efficiency,correctness of statistics and effect to system performance,and introduce an elastic structure(ESI)storing necessary information generated during proceeding our strategy.ASC can pick appropriate time to trigger collecting action and filter unnecessary tasks,meanwhile reasonably allocating collecting tasks to appropriate executing locations with right executing models through the information stored at ESI.We implement and evaluate our strategy in a distributed database.Experiments show that our solutions generally improve the efficiency and correctness of collecting statistics,moreover,reduce the negative effect to system performance comparing with other strategies. 展开更多
关键词 statistics collecting distributed database adaptive strategy query optimization
原文传递
CIMBASE: A Multidatabase Integration Environment 被引量:1
17
作者 王令赤 周立柱 +1 位作者 王国仁 郑怀远 《Tsinghua Science and Technology》 SCIE EI CAS 1996年第2期146-152,共7页
The integration of heterogeneous multidatabases on a network is one of the key issues to be sofved in thedevelopment of CIMS (computer integrated manufacturing system). As a solution to this issue, a multidatabase int... The integration of heterogeneous multidatabases on a network is one of the key issues to be sofved in thedevelopment of CIMS (computer integrated manufacturing system). As a solution to this issue, a multidatabase integration environment, CIMBASE, has been developed. CIMBASE adopts an object-oriented data model and provides users with a series of software tools: a query language, a pre-compiler, a graphical database schema editor,a graphical query interface and a form based query generation tool.This paper discusses in detail the major aspectsof CIMBASE: its object-oriented data model, query language interpreter and the design and implementation of itsprc-compiler. The design and algorithms presented in this paper provide a solid foundation for research on multidatabase integration. 展开更多
关键词 object-oriented database distributed database data model query language pre-compiler CIMS
原文传递
TIFAflow: Enhancing Traffic Archiving System with Flow Granularity for Forensic Analysis in Network Security 被引量:3
18
作者 Zhen Chen Linyun Ruan +2 位作者 Junwei Cao Yifan Yu Xin Jiang 《Tsinghua Science and Technology》 SCIE EI CAS 2013年第4期406-417,共12页
The archiving of Internet traffic is an essential function for retrospective network event analysis and forensic computer communication. The state-of-the-art approach for network monitoring and analysis involves stora... The archiving of Internet traffic is an essential function for retrospective network event analysis and forensic computer communication. The state-of-the-art approach for network monitoring and analysis involves storage and analysis of network flow statistic. However, this approach loses much valuable information within the Internet traffic. With the advancement of commodity hardware, in particular the volume of storage devices and the speed of interconnect technologies used in network adapter cards and multi-core processors, it is now possible to capture 10 Gbps and beyond real-time network traffic using a commodity computer, such as n2disk. Also with the advancement of distributed file system (such as Hadoop, ZFS, etc.) and open cloud computing platform (such as OpenStack, CloudStack, and Eucalyptus, etc.), it is practical to store such large volume of traffic data and fully in-depth analyse the inside communication within an acceptable latency. In this paper, based on well- known TimeMachine, we present TIFAflow, the design and implementation of a novel system for archiving and querying network flows. Firstly, we enhance the traffic archiving system named TImemachine+FAstbit (TIFA) with flow granularity, i.e., supply the system with flow table and flow module. Secondly, based on real network traces, we conduct performance comparison experiments of TIFAflow with other implementations such as common database solution, TimeMachine and TIFA system. Finally, based on comparison results, we demonstrate that TIFAflow has a higher performance improvement in storing and querying performance than TimeMachine and TIFA, both in time and space metrics. 展开更多
关键词 network security traffic archival forensic analysis phishing attack bitmap database hadoop distributed file system cloud computing NoSQL
原文传递
Big data storage technologies: a survey 被引量:17
19
作者 Aisha SIDDIQA Ahmad KARIM Abdullah GANI 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD 2017年第8期1040-1070,共31页
There is a great thrust in industry toward the development of more feasible and viable tools for storing fast-growing volume, velocity, and diversity of data, termed 'big data'. The structural shift of the storage m... There is a great thrust in industry toward the development of more feasible and viable tools for storing fast-growing volume, velocity, and diversity of data, termed 'big data'. The structural shift of the storage mechanism from traditional data management systems to NoSQL technology is due to the intention of fulfilling big data storage requirements. However, the available big data storage technologies are inefficient to provide consistent, scalable, and available solutions for continuously growing heterogeneous data. Storage is the preliminary process of big data analytics for real-world applications such as scientific experiments, healthcare, social networks, and e-business. So far, Amazon, Google, and Apache are some of the industry standards in providing big data storage solutions, yet the literature does not report an in-depth survey of storage technologies available for big data, investigating the performance and magnitude gains of these technologies. The primary objective of this paper is to conduct a comprehensive investigation of state-of-the-art storage technologies available for big data. A well-defined taxonomy of big data storage technologies is presented to assist data analysts and researchers in understanding and selecting a storage mecha- nism that better fits their needs. To evaluate the performance of different storage architectures, we compare and analyze the ex- isling approaches using Brewer's CAP theorem. The significance and applications of storage technologies and support to other categories are discussed. Several future research challenges are highlighted with the intention to expedite the deployment of a reliable and scalable storage system. 展开更多
关键词 Big data Big data storage NoSQL databases distributed databases CAP theorem SCALABILITY Consistency-partition resilience Availability-partition resilience
原文传递
GDM: A New Graph Based Data Model Using Functional Abstractionx 被引量:1
20
作者 Sankhayan Choudhury Nabendu Chaki Swapan Bhattacharya 《Journal of Computer Science & Technology》 SCIE EI CSCD 2006年第3期430-438,共9页
In this paper, a Graph-based semantic Data Model (GDM) is proposed with the primary objective of bridging the gap between the human perception of an enterprise and the needs of computing infrastructure to organize i... In this paper, a Graph-based semantic Data Model (GDM) is proposed with the primary objective of bridging the gap between the human perception of an enterprise and the needs of computing infrastructure to organize information in some particular manner for efficient storage and retrieval. The Graph Data Model (GDM) has been proposed as an alternative data model to combine the advantages of the relational model with the positive features of semantic data models. The proposed GDM offers a structural representation for interacting to the designer, making it always easy to comprehend the complex relations amongst basic data items. GDM allows an entire database to be viewed as a Graph (V, E) in a layered organization. Here, a graph is created in a bottom up fashion where V represents the basic instances of data or a functionally abstracted module, called primary semantic group (PSG) and secondary semantic group (SSG). An edge in the model implies the relationship among the secondary semantic groups. The contents of the lowest layer are the semantically grouped data values in the form of primary semantic groups. The SSGs are nothing but the higher-level abstraction and are created by the method of encapsulation of various PSGs, SSGs and basic data elements. This encapsulation methodology to provide a higher-level abstraction continues generating various secondary semantic groups until the designer thinks that it is sufficient to declare the actual problem domain. GDM, thus, uses standard abstractions available in a semantic data model with a structural representation in terms of a graph. The operations on the data model are formalized in the proposed graph algebra. A Graph Query Language (GQL) is also developed, maintaining similarity with the widely accepted user-friendly SQL. Finally, the paper also presents the methodology to make this GDM compatible with the distributed environment, and a corresponding query processing technique for distributed environment is also suggested for the sake of completeness. 展开更多
关键词 graph data model semantic group semantic data model distributed database fragmentation and allocation schema
原文传递
上一页 1 2 下一页 到第
使用帮助 返回顶部