Funding: Supported by the National High Technology 863 Project (2002AA1Z2308, 2002AA118030).
Abstract: Recent studies have shown that cache behavior is important in the design of main-memory index structures. Cache-conscious indices such as the CSB^+-tree have been shown to outperform conventional main-memory indices such as the AVL-tree and the T-tree. This paper proposes a cache-conscious version of the T-tree, the CST-tree, defined according to the cache-conscious definition. By separating the keys within a node into two parts, the CST-tree can achieve a higher cache hit ratio.
Abstract: This paper presents an efficient recovery scheme suitable for real-time main-memory databases. In the recovery scheme, log records are stored in non-volatile RAM, which is divided into four partitions based on transaction types. Similarly, the main-memory database is divided into four partitions based on data types. When the usage ratio of the log store area exceeds a threshold value, the checkpoint procedure is triggered. During the checkpoint procedure, some useless log records are deleted. During restart recovery after a crash, a partition reloading policy is adopted to ensure that critical data are reloaded and restored first, so that the database system can be brought up before the entire database is reloaded into main memory. Downtime is therefore markedly reduced. Simulation experiments show that the recovery scheme markedly improves system performance and helps meet the deadlines of real-time transactions.
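The threshold-triggered checkpoint described in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the `LogPartition` and `LogRecord` classes, the `reflected_on_disk` flag used as the criterion for a "useless" record, and the threshold value are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    txn_id: int
    payload: str
    # Assumption: a record is "useless" once its effect is already durable.
    reflected_on_disk: bool = False


@dataclass
class LogPartition:
    capacity: int             # maximum number of records in this partition
    threshold: float = 0.8    # hypothetical usage ratio that triggers a checkpoint
    records: list = field(default_factory=list)

    def usage_ratio(self) -> float:
        return len(self.records) / self.capacity

    def append(self, record: LogRecord) -> None:
        self.records.append(record)
        # Trigger the checkpoint only when usage exceeds the threshold.
        if self.usage_ratio() > self.threshold:
            self.checkpoint()

    def checkpoint(self) -> None:
        # Drop the records that are no longer needed for recovery.
        self.records = [r for r in self.records if not r.reflected_on_disk]
```

In this sketch each of the four log partitions would be its own `LogPartition` instance, so a busy transaction type can checkpoint without touching the others.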
Abstract: To implement semantic mapping for database metasearch engines, a system is proposed that uses an ontology as the organizational form of information and records new words that do not appear in the ontology. When a new word's frequency of use exceeds a threshold, it is added to the ontology; ontology expansion is implemented in this way. The search process supports the Boolean operations "and" and "or" accordingly. To improve the mapping speed of the system, a memory module is added that memorizes users' recent query information and automatically learns their query interests during mapping, which allows it to decide the search order of instance tables dynamically. Experiments show that these measures can markedly reduce the average mapping time.
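The frequency-based ontology expansion described in this abstract can be sketched as below. This is an illustration under stated assumptions, not the authors' system: the class name `ExpandingOntology`, the flat set of terms standing in for an ontology, and the default threshold are all hypothetical.

```python
from collections import Counter


class ExpandingOntology:
    def __init__(self, terms, threshold=3):
        self.terms = set(terms)      # assumption: the ontology is a flat term set
        self.threshold = threshold   # hypothetical frequency threshold
        self.pending = Counter()     # usage counts of words not yet in the ontology

    def observe(self, word):
        """Record one use of `word`; return True if it is in the ontology afterwards."""
        if word in self.terms:
            return True
        self.pending[word] += 1
        # Promote the word once its frequency of use exceeds the threshold.
        if self.pending[word] > self.threshold:
            self.terms.add(word)
            del self.pending[word]
            return True
        return False
```

With `threshold=2`, a word unknown to the ontology is promoted on its third use.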
Abstract: A new secured database management system architecture using intrusion detection systems (IDS) is proposed in this paper for organizations with no previous role mapping for users. A simple representation of Structured Query Language queries is proposed so that the clustering algorithm used can be applied easily. A new clustering algorithm that uses a tube search with adaptive memory is applied to database log files to create users' profiles. Queries issued by each user are then checked against the related user profile using a classifier to determine whether each query is malicious. If a query is malicious, the IDS stops query execution or reports the threat to the responsible person. A simple classifier based on the Euclidean distance is used: the issued query is transformed to the proposed simple representation, and the Euclidean distance between the centers of the profile and the issued query is calculated. A synthetic data set is used for the experimental evaluations, and normal user access behavior in relation to the database is modelled using this data set. The false negative (FN) and false positive (FP) rates are used to compare the proposed algorithm with other methods. The experimental results indicate that the proposed method yields very small FN and FP rates.
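A Euclidean-distance check of a query against a user profile, as outlined in this abstract, might look like the following minimal sketch. The numeric query vectors, the function name `is_malicious`, and the `max_distance` cutoff are assumptions; the abstract does not specify the query representation or the decision rule.

```python
import math


def is_malicious(query_vec, profile_centers, max_distance):
    """Flag a query whose nearest profile center is farther than max_distance.

    query_vec and each center are equal-length numeric sequences (assumed
    to be the paper's "simple representation" of an SQL query).
    """
    nearest = min(math.dist(query_vec, center) for center in profile_centers)
    return nearest > max_distance
```

A query close to any cluster center in the user's profile is accepted; one far from all of them would be stopped or reported.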
Funding: Supported by the National Key R&D Program of China "Key technologies for coordination and interoperation of power distribution service resource" [2021YFB1302400]; "Research on Digitization and Intelligent Application of Low-Voltage Power Distribution Equipment" [SGSDDK00PDJS2000375].
Abstract: With the full development of disk-resident databases (DRDB) in recent years, they have been widely used in business and transactional applications. In long-term use, some problems of disk databases have gradually been exposed. For applications with high real-time requirements, the performance of a disk database is not satisfactory. In the context of the booming development of the Internet of Things, domestic real-time databases have also gradually developed. Still, most of them only support the storage, processing, and analysis of data values with few data types, which cannot fully meet the needs of current industrial process control systems: many data types, complex sources, and fast update speeds. Facing the business need for efficient data collection and storage in the Internet of Things, this paper optimizes the transaction processing efficiency and data storage performance of the memory database, constructs and implements a lightweight real-time memory database transaction processing and data storage model, and improves the reliability and efficiency of the database. Simulation shows that the cache hit rate of the cache replacement algorithm proposed in this paper is higher than that of the traditional LRU (Least Recently Used) algorithm. Using the proposed cache replacement algorithm can improve the performance of the system cache.
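For reference, the LRU baseline that this abstract compares against can be measured with a short sketch like the one below. It implements only standard LRU, not the paper's proposed replacement algorithm, which the abstract does not describe; the function name `lru_hit_rate` is hypothetical.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Return the fraction of accesses served from an LRU cache of `capacity` keys."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[key] = True
    return hits / len(accesses) if accesses else 0.0
```

Running a candidate policy over the same access trace and comparing the two hit rates gives the kind of comparison the abstract reports.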
Abstract: Storing the whole database in main memory is a common method of processing real-time transactions in real-time database systems. The recovery mechanism of Main-memory Real-time Database Systems (MMRTDBS) should reflect the characteristics of both main-memory and real-time databases, because their structures are quite different from those of conventional database systems. In this paper, therefore, we propose a multi-level recovery mechanism for main-memory real-time database systems with Extendable Chained Bucket Hashing (ECBH). Because real-time data occur in real-time systems, the recovery mechanism also takes them into account. According to our performance test, this mechanism can improve transaction concurrency and reduce the deadline miss rate of transactions.
Abstract: A Holter monitor usually records electrocardiogram (ECG) signals for more than 24 hours to capture short-lived cardiac abnormalities. Given the large amount of Holter data and the fact that normal data account for the majority, it is reasonable to design an algorithm that automatically eliminates as many normal data segments as possible without missing any abnormal segments, and then passes the remaining segments to doctors or computer programs for further diagnosis. In this paper, we propose a preliminary abnormal segment screening method for Holter data. A prediction model based on long short-term memory (LSTM) networks is established and trained with the normal data of a monitored subject. Then, using kernel density estimation, we learn the distribution of prediction errors obtained by applying the trained LSTM model to the regular data. On this basis, the preliminary abnormal ECG segment screening is carried out without R wave detection. Experiments on the MIT-BIH arrhythmia database show that, under the condition that no abnormal point is missed, 53.89% of normal segments can be screened out. This work can greatly reduce the workload of subsequent processing.
Funding: Supported by a grant from HP Lab China and by the National Natural Science Foundation of China under Grant Nos. 60496325 and 60573092.
Abstract: As the speed gap between main memory and modern processors continues to widen, cache behavior becomes more important for main memory database systems (MMDBs). Indexing is a key component of MMDBs. Unfortunately, the predominant indexes, B^+-trees and T-trees, have been shown to utilize the cache poorly, which has triggered the development of many cache-conscious indexes, such as CSB^+-trees and pB^+-trees. Most of these cache-conscious indexes are variants of conventional B^+-trees and have better cache performance than B^+-trees. In this paper, we develop a novel J^+-tree index, inspired by the Judy structure, which is an associative array data structure, and propose a more cache-optimized index, the Prefetching J^+-tree (pJ^+-tree), which applies prefetching to the J^+-tree to accelerate range scan operations. The J^+-tree stores all the keys in its leaf nodes and keeps the reference values of the leaf nodes in a Judy structure, so that the J^+-tree not only retains the advantages of Judy (such as fast single-value search) but also outperforms it in other respects. For example, J^+-trees can achieve better performance on range queries than Judy. The pJ^+-tree index exploits prefetching techniques to further improve the cache behavior of J^+-trees and yields a speedup of 2.0 on range scans. Our extensive experimental study shows that, compared with B^+-trees, CSB^+-trees, pB^+-trees and T-trees, pJ^+-trees can provide better performance in both time (search, scan, update) and space.
Abstract: We have witnessed exciting development of RAM technology in the past decade. Memory sizes have grown rapidly and prices continue to decrease, so that it is feasible to deploy large amounts of RAM in a computer system. Several companies and research institutions have devoted a lot of resources to developing in-memory databases (IMDB) that execute queries after loading data into (virtual) memory in advance. The proliferation of in-memory databases calls for testing and evaluating their performance objectively and fairly. Although existing database benchmarks such as the Wisconsin benchmark and the TPC-X series have achieved great success, they are not suitable for in-memory databases because they do not consider the unique characteristics of an IMDB. In this study, we propose MemTest, a novel benchmark that addresses some major characteristics of an in-memory database. The benchmark defines specific metrics, which cover processing time, compression ratio, minimal memory space and column strength of an in-memory database. We design a data model based on inter-bank transaction applications, and a data generator that supports uniform and skewed data distributions. The MemTest workload includes a set of queries and transactions targeting the metrics and data model. Finally, we illustrate the efficacy of MemTest through implementations on two different in-memory databases.