Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U1911203, 61877018, 61977025, and 62202170) and by Alibaba Group through the Alibaba Innovation Research Program.
Abstract: BERT is a representative pre-trained language model that has drawn extensive attention for significant improvements in downstream Natural Language Processing (NLP) tasks. Its complex architecture and massive parameters give BERT competitive performance but also make model inference slow. To speed up BERT inference, FastBERT realizes adaptive inference with an acceptable drop in accuracy, based on knowledge distillation and the early-exit technique. However, several factors may limit the performance of FastBERT, such as a teacher classifier that is not knowledgeable enough, batch size shrinkage, and redundant computation of student classifiers. To overcome these limitations, we propose a new BERT inference method with GPU-Efficient Exit Prediction (GEEP). GEEP leverages a shared exit loss to simplify FastBERT's two-step training process into a single step, and makes the teacher classifier more knowledgeable by feeding diverse Transformer outputs to it. In addition, an exit-layer prediction technique is proposed that uses a GPU hash table to track the token-level exit-layer distribution and to sort test samples by their predicted exit layers. In this way, GEEP avoids batch size shrinkage and redundant computation of student classifiers. Experimental results on twelve public English and Chinese NLP datasets prove the effectiveness of the proposed approach. The source code of GEEP will be released to the public upon paper acceptance.
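To make the batching idea above concrete, here is a minimal sketch of exit-layer-sorted batching, with a toy stand-in for the learned exit-layer predictor (this is an illustration under stated assumptions, not GEEP's released code):

```python
from collections import defaultdict

# Hypothetical predictor standing in for GEEP's learned exit-layer
# prediction (backed by a GPU hash table over token-level statistics).
def predict_exit_layer(sample, n_layers=12):
    return min(len(sample) // 8 + 1, n_layers)

def exit_sorted_batches(samples, batch_size=32):
    # Group samples by predicted exit layer so that every sample in a
    # batch exits at the same layer: the batch never shrinks mid-inference
    # and no student classifier beyond the exit layer is evaluated.
    buckets = defaultdict(list)
    for s in samples:
        buckets[predict_exit_layer(s)].append(s)
    for exit_layer in sorted(buckets):
        group = buckets[exit_layer]
        for i in range(0, len(group), batch_size):
            yield exit_layer, group[i:i + batch_size]

# Each yielded batch would run Transformer layers [0, exit_layer) once,
# then a single classifier at exit_layer.
for exit_layer, batch in exit_sorted_batches(["short text", "a much longer input " * 4]):
    print(exit_layer, len(batch))
```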
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62002119, 61977026, 62072180, and 61772202), by the Fundamental Research Funds for the Central Universities, Southwest Minzu University (2021PTJS23), and by the Open Fund of the Shanghai Engineering Research Center on Big Data Management System.
Abstract: On-line transaction processing (OLTP) systems rely on transaction logging and quorum-based consensus protocols to guarantee durability, high availability, and strong consistency. This makes the log manager a key component of distributed database management systems (DDBMSs). The leader of a DDBMS commonly adopts a centralized logging method to write log entries into a stable storage device, and uses a constant log replication strategy to periodically synchronize its state to followers. With the advent of new hardware and the high parallelism of transaction processing, the traditional centralized design of logging limits scalability, and a constant replication trigger condition cannot always maintain optimal performance under dynamic workloads. In this paper, we propose a new log manager named Salmo with scalable logging and adaptive replication for distributed database systems. The scalable logging eliminates centralized contention by utilizing a highly concurrent data structure and speedy log hole tracking. The kernel of adaptive replication is an adaptive log shipping method, which dynamically adjusts the number of log entries transmitted between leader and followers based on the real-time workload. We implemented and evaluated Salmo in the open-sourced transaction processing systems Cedar and DBx1000. Experimental results show that Salmo scales well as the number of working threads increases, improves peak throughput by 1.56× and reduces latency by more than 4× over the log replication of Raft, and maintains efficient and stable performance under dynamic workloads at all times.
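The adaptive log shipping idea can be sketched as a simple batch-size controller; the policy below (halve on latency overshoot, grow additively under backlog) is an assumption for illustration, not Salmo's actual algorithm:

```python
class AdaptiveLogShipper:
    # Toy controller: adapt the number of log entries per replication RPC
    # to the observed round-trip latency and the pending-log backlog.
    def __init__(self, lo=16, hi=4096):
        self.lo, self.hi = lo, hi
        self.batch = lo

    def next_batch_size(self, pending_entries, last_rtt_ms, target_ms=5.0):
        if last_rtt_ms > target_ms:
            # Replication is falling behind: ship smaller batches.
            self.batch = max(self.lo, self.batch // 2)
        elif pending_entries > self.batch:
            # Log arrival outpaces shipping: amortize more entries per RPC.
            self.batch = min(self.hi, self.batch + self.lo)
        return min(self.batch, max(pending_entries, 1))

shipper = AdaptiveLogShipper()
print(shipper.next_batch_size(pending_entries=500, last_rtt_ms=2.0))  # grows
```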
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62072179), the ECNU-OceanBase Joint Lab of Distributed Database System, and the 2020 Key Software Adaptation and Verification Project (Database).
Abstract: Massive-scale transactions with critical requirements have become popular for emerging businesses, especially in e-commerce. One of the most representative applications is the promotional event running on Alibaba's platform on special dates, widely anticipated by global customers. Although we have achieved significant progress in improving the scalability of transactional (OLTP) database systems, the presence of contended operations in workloads is still one of the fundamental obstacles to performance improvement. The reason is that the overhead of managing conflicting transactions with concurrency control mechanisms is proportional to the amount of contention. As a consequence, generating contended workloads is essential for evaluating the performance of modern OLTP database systems. Standard benchmarks provide some ways of simulating contention, e.g., skew distribution control of transactions, but they cannot control the generation of contention quantitatively; even worse, the simulation effectiveness of these methods is affected by the scale of data. So in this paper we design a scalable quantitative contention generation method with fine contention granularity control. We conduct a comprehensive set of experiments on popular open-sourced DBMSs against the latest contention simulation method to demonstrate the effectiveness of our generation work.
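As an illustration of what quantitative contention control means, the hedged sketch below makes the contended-access probability an explicit parameter that is independent of the data scale (this shows the general idea, not the paper's generator):

```python
import random

def gen_txn_keys(n_rows, hot_rows=64, contention_rate=0.3, ops_per_txn=4):
    # Each access hits the fixed-size hot set with probability
    # contention_rate, independent of n_rows; plain skewed (e.g., Zipfian)
    # sampling cannot promise that as the table grows.
    keys = []
    for _ in range(ops_per_txn):
        if random.random() < contention_rate:
            keys.append(random.randrange(hot_rows))            # contended
        else:
            keys.append(random.randrange(hot_rows, n_rows))    # cold rows
    return keys

# The expected share of contended accesses stays at 0.3 whether the
# table holds 1e5 or 1e9 rows.
print(gen_txn_keys(n_rows=100_000))
```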
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61732014 and 62141214) and the National Key Research and Development Program of China (2018YFB1003400).
Abstract: In analytical queries, a number of important operators like JOIN and GROUP BY are suitable for parallelization, and the GPU is an ideal accelerator considering its power of parallel computing. However, when data size increases to hundreds of gigabytes, one GPU card becomes insufficient due to the small capacity of global memory and the slow data transfer between host and device. A straightforward solution is to equip more GPUs linked with high-bandwidth connectors, but the cost increases sharply. We utilize unified memory (UM), provided by NVIDIA CUDA (Compute Unified Device Architecture), to make it possible to accelerate large-scale queries on just one GPU, but we notice that the transfer performance between host and UM, which happens before kernel execution, is often significantly slower than the theoretical bandwidth. An important reason is that, in a single-GPU environment, data processing systems usually invoke only one or a static number of threads for data copy, leading to an inefficient transfer that heavily slows down overall performance. In this paper, we present D-Cubicle, a runtime module to accelerate data transfer between host-managed memory and unified memory. D-Cubicle boosts the actual transfer speed dynamically through a self-adaptive approach. In our experiments, taking data transfer into account, D-Cubicle processes 200 GB of data on a single GPU with 32 GB of global memory, achieving on average 1.43x and at most 2.09x the performance of the baseline system.
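The self-adaptive transfer idea can be pictured with a simplified Python analogy (not D-Cubicle's CUDA implementation; numpy slice copies stand in for host-to-unified-memory transfers): probe a few thread counts on a small prefix, then copy the bulk with the best one.

```python
import time
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def _copy(dst, src, lo, hi):
    dst[lo:hi] = src[lo:hi]       # numpy releases the GIL during the copy

def copy_chunked(dst, src, n_threads):
    # Split the buffer and copy chunks with a pool of worker threads.
    chunk = (len(src) + n_threads - 1) // n_threads
    with ThreadPoolExecutor(n_threads) as pool:
        for lo in range(0, len(src), chunk):
            pool.submit(_copy, dst, src, lo, min(lo + chunk, len(src)))

def adaptive_copy(dst, src, probe=1 << 22, candidates=(1, 2, 4, 8, 16)):
    best_n, best_t = 1, float("inf")
    for n in candidates:                     # probe each n on a small prefix
        t0 = time.perf_counter()
        copy_chunked(dst[:probe], src[:probe], n)
        t = time.perf_counter() - t0
        if t < best_t:
            best_n, best_t = n, t
    copy_chunked(dst[probe:], src[probe:], best_n)   # bulk copy at best n
    return best_n

src = np.random.randint(0, 256, 1 << 24, dtype=np.uint8)
dst = np.empty_like(src)
print("chose", adaptive_copy(dst, src), "copy threads")
```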
Abstract: 1 Introduction and main contributions. Emerging hardware like Remote Direct Memory Access (RDMA)-capable networks and persistent memory (PM) is promising for building fast, highly available in-memory key-value stores. The recent advent of Intel Optane DC Persistent Memory Modules (Optane DCPMM) brings that future closer. However, existing studies that combine the two devices cannot deliver the desired performance due to their two-phase protocols for log shipping, and most of them were based on emulation that performs sub-optimally on real PM hardware.
Abstract: Information on the Internet is fragmented and presented in different data sources, which makes automatic knowledge harvesting and understanding formidable for machines, and even for humans. Knowledge graphs have become prevalent in both industry and academic circles in recent years, as one of the most efficient and effective knowledge integration approaches. Techniques for knowledge graph construction can mine information from structured, semi-structured, or even unstructured data sources, and finally integrate the information into knowledge represented in a graph. Furthermore, a knowledge graph is able to organize information in an easy-to-maintain, easy-to-understand and easy-to-use manner. In this paper, we give a summarization of techniques for constructing knowledge graphs. We review the existing knowledge graph systems developed by both academia and industry. We discuss in detail the process of building knowledge graphs, and survey state-of-the-art techniques for automatic knowledge graph checking and expansion via logical inference and reasoning. We also review the issues of graph data management by introducing knowledge data models and graph databases, especially from a NoSQL point of view. Finally, we overview current knowledge graph systems and discuss future research directions.
Abstract: With the increasing number of GPS-equipped vehicles, more and more trajectories are generated continuously, based on which some urban applications become feasible, such as route planning. In general, a popular route that has been travelled frequently is a good choice, especially for people who are not familiar with the road network. Moreover, accurate estimation of the travel cost (such as travel time, travel fee and fuel consumption) will benefit a well-scheduled trip plan. In this paper, we address this issue by finding the popular route with travel cost estimation. To this end, we design a system consisting of three main components. First, we propose a novel structure, called the popular traverse graph, where each node is a popular location and each edge is a popular route between locations, to summarize historical trajectories without road network information. Second, we propose a self-adaptive method to model the travel cost on each popular route at different time intervals, so that each time interval has a stable travel cost. Finally, based on the graph, given a query consisting of a source, a destination and a leaving time, we devise an efficient route planning algorithm which considers optimal route concatenation to search for the popular route from source to destination at the leaving time with accurate travel cost estimation. Moreover, we conduct comprehensive experiments and implement our system as a mobile app; the results show that our method is both effective and efficient.
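A hedged sketch of the route search over the popular traverse graph follows; the graph and cost_at interfaces are assumptions, the cost is taken to be travel time so it also advances the clock, and costs are assumed to satisfy the usual first-in-first-out property so the Dijkstra-style settle order remains valid:

```python
import heapq

def plan_route(graph, src, dst, leave_time, cost_at):
    # graph: {node: [(next_node, route_id), ...]} over popular locations
    # cost_at(route_id, t): travel cost of a popular route entered at time t,
    # looked up from the per-interval cost model.
    pq = [(0.0, leave_time, src, [src])]
    settled = set()
    while pq:
        cost, t, node, path = heapq.heappop(pq)
        if node == dst:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, route in graph.get(node, []):
            c = cost_at(route, t)
            heapq.heappush(pq, (cost + c, t + c, nxt, path + [nxt]))
    return None

toy = {"A": [("B", "r1")], "B": [("C", "r2")]}
print(plan_route(toy, "A", "C", 8.0, lambda r, t: 1.0 if t < 9 else 2.0))
# -> (3.0, ['A', 'B', 'C']): the second leg is entered in a pricier interval
```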
Funding: This research was supported in part by the National Key R&D Program of China (2018YFB1003303) and the National Natural Science Foundation of China (Grant Nos. 61432006, 61732014 and 61972149).
Abstract: State machine replication has been widely used in modern cluster-based database systems. Most commonly deployed configurations adopt a Raft-like consensus protocol, in which a single strong leader replicates the log to other followers. Since the followers can handle read requests and many real workloads are usually read-intensive, the recovery speed of a crashed follower may significantly impact throughput. Different from traditional database recovery, a recovering follower needs to repair its local log first. The original Raft protocol takes many network round trips to compare logs between the leader and the crashed follower. To reduce network round trips, an optimization method is to truncate the follower's uncertain log entries behind the latest local commit point, and then to directly fetch all committed log entries from the leader in one round trip. However, if the commit point is not persisted, the recovering follower has to get the whole log from the leader. In this paper, we propose an accurate and efficient log repair (AELR) algorithm for follower recovery. AELR is more robust and resilient to follower failure, and it needs only one network round trip to fetch the least number of log entries for follower recovery. This approach is implemented in the open-source database system OceanBase. We experimentally show that the system adopting AELR performs well in terms of recovery time.
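The one-round-trip recovery can be pictured with a small sketch (the interfaces are assumptions, not OceanBase's API): with a durable commit point, the follower truncates its uncertain suffix locally and fetches exactly the committed entries it is missing.

```python
def recover_follower(local_log, durable_commit_point, leader):
    # Entries after the commit point may conflict with the leader's log;
    # drop them instead of comparing entry-by-entry over the network.
    del local_log[durable_commit_point + 1:]
    # Single round trip: fetch exactly the committed entries we lack.
    local_log.extend(leader.fetch_committed(since=durable_commit_point + 1))
    return local_log

class FakeLeader:
    def __init__(self, log):
        self.log = log
    def fetch_committed(self, since):
        return self.log[since:]

follower = ["e0", "e1", "uncertain"]
print(recover_follower(follower, 1, FakeLeader(["e0", "e1", "e2", "e3"])))
# -> ['e0', 'e1', 'e2', 'e3']
```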
Funding: The National High-tech R&D Program (2015AA015307), the National Natural Science Foundation of China (Grant Nos. 61702189, 61432006 and 61672232), and the Youth Science and Technology "Yang Fan" Program of Shanghai (17YF1427800).
Abstract: The log-structured merge tree (LSM-tree) has been adopted by many distributed storage systems. It decomposes a large database into multiple parts: an in-writing part and several read-only ones. Records are first written into a memory-optimized structure and then compacted into on-disk structures periodically. This achieves high write throughput, but brings the side effect that read requests have to go through multiple structures to find the required record. In a distributed database system, different parts of the LSM-tree are stored in a distributed fashion. To this end, a server in the query layer has to issue multiple network communications to pull data items from the underlying storage layer. Coming to its rescue, this work proposes a precise data access strategy which includes: an efficient structure with low maintenance overhead, designed to test whether a record exists in the in-writing part of the LSM-tree; and a lease-based synchronization strategy, proposed to maintain consistent copies of the structure on remote query servers. We further prove the technique is capable of working robustly while the LSM-tree is reorganizing multiple structures in the background. It is also fault-tolerant, able to recover the structures used in data access after node failures happen. Experiments using the YCSB benchmark show that the solution has 6x throughput improvement over existing methods.
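The membership test plus lease idea can be sketched with assumed primitives (a plain Bloom filter stands in for the paper's structure, and the lease logic is simplified): a query server may trust its cached copy only while the lease is unexpired.

```python
import time
import hashlib

class BloomFilter:
    # Compact membership test over the in-writing part; false positives
    # are possible, false negatives are not.
    def __init__(self, n_bits=1 << 16, n_hashes=4):
        self.bits, self.n_bits, self.n_hashes = 0, n_bits, n_hashes

    def _positions(self, key):
        for i in range(self.n_hashes):
            h = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8)
            yield int.from_bytes(h.digest(), "big") % self.n_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def may_contain(self, key):
        return all(self.bits >> p & 1 for p in self._positions(key))

class LeasedCopy:
    # Cached filter on a remote query server; valid only while the lease
    # granted by the storage node still holds.
    def __init__(self, bloom, lease_seconds):
        self.bloom = bloom
        self.expires = time.monotonic() + lease_seconds

    def probe(self, key):
        if time.monotonic() >= self.expires:
            raise TimeoutError("lease expired: re-sync with the storage node")
        # False means the key is surely absent from the in-writing part,
        # so the query server can skip one network pull entirely.
        return self.bloom.may_contain(key)
```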
Funding: The National Key Research and Development Program of China (2016YFB1000905), the National Natural Science Foundation of China (Grant Nos. U1401256, 61402177, 61672234, 61402180 and 61232002), and the NSF of Shanghai (14ZR1412600).
Abstract: Entity alignment is the problem of identifying which entities in one data source refer to the same real-world entity in the others. Identifying entities across heterogeneous data sources is paramount to many research fields, such as data cleaning, data integration, information retrieval and machine learning. The aligning process is not only overwhelmingly expensive for large data sources, since it involves all tuples from two or more data sources, but also needs to handle heterogeneous entity attributes. In this paper, we propose an unsupervised approach, called EnAli, to match entities across two or more heterogeneous data sources. EnAli employs a generative probabilistic model that incorporates heterogeneous entity attributes via the exponential family and handles missing values, and it utilizes a locality-sensitive hashing scheme to reduce the candidate tuples and speed up the aligning process. EnAli is highly accurate and efficient even without any ground-truth tuples. We illustrate the performance of EnAli on re-identifying entities from the same data source, as well as on aligning entities across three real data sources. Our experimental results show that our proposed approach outperforms the comparable baselines.
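The candidate-reduction step can be illustrated with a generic MinHash-LSH sketch; treating the hashing scheme as MinHash with banding is an assumption, since the abstract does not specify the exact scheme. Tuples become candidate pairs only when at least one signature band collides:

```python
import random
from collections import defaultdict

def minhash(tokens, seeds):
    # One min-hash value per seed; similar token sets collide often.
    return tuple(min(hash((seed, tok)) for tok in tokens) for seed in seeds)

def lsh_candidates(records, n_hashes=12, bands=4):
    seeds = [random.random() for _ in range(n_hashes)]
    rows = n_hashes // bands
    buckets, pairs = defaultdict(list), set()
    for rid, tokens in records.items():
        sig = minhash(tokens, seeds)
        for b in range(bands):
            key = (b, sig[b * rows:(b + 1) * rows])   # one band of the signature
            for other in buckets[key]:
                pairs.add(tuple(sorted((rid, other))))
            buckets[key].append(rid)
    return pairs   # only these pairs go to the expensive probabilistic model

records = {"a1": {"ipad", "16gb", "silver"},
           "b7": {"apple", "ipad", "16gb", "silver"},
           "c3": {"galaxy", "tab", "10in"}}
print(lsh_candidates(records))   # very likely {('a1', 'b7')}; a1-c3 share no band
```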
Abstract: Data uncertainty widely exists in many web applications, financial applications and sensor networks. Ranking queries that return a number of tuples with maximal ranking scores are important in the field of database management. Most existing work focuses on proposing static solutions for various ranking semantics over uncertain data. Our focus is to handle continuous ranking queries on uncertain data streams: testing each new tuple in order to output highly-ranked tuples. The main challenge comes not only from the fact that the possible-world space grows exponentially as new tuples arrive, but also from the requirement for low space and time complexity to suit streaming environments. This paper aims at handling continuous ranking queries on uncertain data streams. We first study how to handle this issue exactly, and then propose a novel method (exponential sampling) to estimate the expected rank of a tuple with high quality. Theoretical analysis and detailed experimental reports evaluate the proposed methods.
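The abstract does not detail exponential sampling, so the sketch below shows only the baseline idea it refines, a plain Monte-Carlo estimate of a tuple's expected rank over sampled possible worlds; every detail here is an assumption:

```python
import random

def sampled_expected_rank(tuples, target, n_samples=10_000):
    # tuples: list of (score, existence_probability).
    # Rank = number of co-existing tuples with a strictly higher score,
    # estimated over worlds in which the target tuple exists; enumerating
    # all 2^n possible worlds would be infeasible on a stream.
    score_t, p_t = tuples[target]
    ranks = []
    for _ in range(n_samples):
        if random.random() >= p_t:
            continue                      # worlds without the target tuple
        rank = sum(1 for i, (s, p) in enumerate(tuples)
                   if i != target and s > score_t and random.random() < p)
        ranks.append(rank)
    return sum(ranks) / len(ranks) if ranks else float("inf")

data = [(0.9, 0.5), (0.8, 1.0), (0.7, 0.9)]
print(sampled_expected_rank(data, target=2))
# ≈ 1.5: tuple 0 outranks it half the time, tuple 1 always
```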
Funding: This work was partially supported by the Youth Foundation of the National Science Foundation (61702189) and the National Science Foundation (61772202).
Abstract: Modern database systems urgently need the ability to support highly scalable transactions and efficient queries simultaneously for real-time applications. One solution is to apply query optimization techniques to on-line transaction processing (OLTP) systems. The materialized view is considered a panacea for decreasing query latency, but it also incurs significant maintenance cost that trades away transaction performance. In this paper, we examine the design space and identify several design features for implementing a view on a distributed log-structured merge-tree (LSM-tree), a well-known structure for improving data write performance. As a result, we develop two incremental view maintenance (IVM) approaches on the LSM-tree. One avoids join computation in view maintenance transactions. The other, with two optimizations, decouples view maintenance from transaction processing. Under asynchronous updates, we also provide consistent queries over views. Experiments on the TPC-H benchmark show that our methods achieve better performance than straightforward methods on different workloads.
Abstract: Most entity ranking research aims to retrieve a ranked list of entities from a Web corpus given a user query. The rank order of entities is determined by the relevance between the query and the contexts of entities. However, entities can also be ranked directly based on their relative importance in a document collection, independent of any query. In this paper, we introduce an entity ranking algorithm named NERank+. Given a document collection, NERank+ first constructs a graph model called the Topical Tripartite Graph, consisting of document, topic and entity nodes. We design separate ranking functions to compute the prior ranks of entities and topics, respectively. A meta-path constrained random walk algorithm is then proposed to propagate prior entity and topic ranks based on the graph model. We evaluate NERank+ over real-life datasets and compare it with baselines. Experimental results illustrate the effectiveness of our approach.
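A hedged sketch of the rank propagation step: assuming row-stochastic transition matrices for the two hops of an entity-topic-entity meta-path and a restart to the prior ranks, the propagation is a short power iteration (the paper's formulation over the full tripartite graph is richer):

```python
import numpy as np

def propagate_ranks(prior_e, E2T, T2E, alpha=0.85, iters=100):
    # E2T: row-stochastic entity-to-topic transitions (first meta-path hop)
    # T2E: row-stochastic topic-to-entity transitions (second hop)
    prior = prior_e / prior_e.sum()
    e = prior.copy()
    for _ in range(iters):
        # One constrained walk step mixed with a restart to the prior.
        e = alpha * (e @ E2T @ T2E) + (1 - alpha) * prior
    return e

E2T = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])   # 3 entities, 2 topics
T2E = np.array([[0.6, 0.4, 0.0], [0.0, 0.3, 0.7]])
print(propagate_ranks(np.array([1.0, 1.0, 1.0]), E2T, T2E).round(3))
```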
Funding: This work was partially supported by the National Science Foundation of China (Grant Nos. 61103039, 61232002, 61472345), the National Basic Research Program of China (2010CB731402) and the Wuhan Key Lab Research Foundation (SKLSE2012-09-16).
Abstract: Currently, there are many online review web sites where consumers can freely write comments about different kinds of products and services. These comments are quite useful for other potential consumers. However, the number of online comments is often large and continues to grow as more and more consumers contribute. In addition, one comment may mention more than one product and contain opinions about different products, mentioning something good and something bad, yet the comment carries only a single overall score. Therefore, it is not easy to learn the quality of an individual product from these comments. This paper presents a novel approach to generate review summaries, including scores and description snippets, with respect to each individual product. From the large number of comments, we first extract the context (snippet) that includes a description of the products and choose those snippets that express consumer opinions on them. We then propose several methods to predict the rating (from 1 to 5 stars) of the snippets. Finally, we derive a generic framework for generating summaries from the snippets. We design a new snippet selection algorithm, based on a standard seat allocation algorithm, to ensure that the returned results preserve the opinion-aspect statistical properties and attribute-aspect coverage. Through experiments we demonstrate empirically that our methods are effective. We also quantitatively evaluate each step of our approach.
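The abstract names only "a standard seat allocation algorithm"; assuming a D'Hondt-style divisor method for illustration, allocating summary slots in proportion to opinion-aspect frequencies looks roughly like this:

```python
def allocate_slots(aspect_counts, k):
    # D'Hondt divisors: the next slot goes to the aspect with the largest
    # count / (slots_taken + 1), keeping slots roughly proportional to how
    # often each opinion-aspect appears in the comments.
    seats = {a: 0 for a in aspect_counts}
    for _ in range(k):
        a = max(seats, key=lambda a: aspect_counts[a] / (seats[a] + 1))
        seats[a] += 1
    return seats

print(allocate_slots({"battery": 30, "screen": 20, "price": 10}, k=6))
# -> {'battery': 3, 'screen': 2, 'price': 1}
```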
Abstract: As service-oriented architecture (SOA) matures, service consumption demand leads to an urgent requirement for service discovery. Unlike Web documents, services are intended to be executed to achieve objectives and/or desired goals of users. This leads to the notion that service discovery should take the "usage context" of a service into account as well as the service content (descriptions), which has been well explored. In this paper, we introduce the concept of service context, which is used to represent service usage. In query processing, both service content and service context are examined to identify services. We propose to represent service context by a weighted bipartite graph model. Based on the bipartite graph model, we reduce the gap between query space and service space by query expansion to improve recall. We also design an iterative algorithm for result ranking that considers service context usefulness as well as content relevance to improve precision. Finally, we develop a service search engine implementing this mechanism, and conduct experiments to verify our idea.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61572135 and 61170085), the 973 project (2010CB328106), and the Program for New Century Excellent Talents in China (NCET-10-0388).
Abstract: The key issue in top-k retrieval, finding a set of k documents (from a large document collection) that can best answer a user's query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k retrieval problem in the framework of facility location analysis and prove the submodularity of the objective function, which provides a theoretical approximation guarantee of factor 1 - 1/e for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first obtains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy can outperform the existing approaches effectively and efficiently.
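The greedy stage can be made concrete with a facility-location-style sketch; the exact objective and its relevance/diversity weighting here are assumptions. A nonnegative relevance term plus a facility-location coverage term is monotone submodular, so best-first greedy enjoys the 1 - 1/e guarantee cited above:

```python
def greedy_topk(docs, k, rel, sim, lam=0.5):
    # Objective: f(S) = lam * sum(rel[s] for s in S)
    #                 + (1-lam) * sum_d max_{s in S} sim[s][d]
    selected = []
    best_cov = {d: 0.0 for d in docs}    # current max similarity to the set
    for _ in range(k):
        def marginal_gain(x):
            cov_gain = sum(max(0.0, sim[x][d] - best_cov[d]) for d in docs)
            return lam * rel[x] + (1 - lam) * cov_gain
        x = max((d for d in docs if d not in selected), key=marginal_gain)
        selected.append(x)
        for d in docs:
            best_cov[d] = max(best_cov[d], sim[x][d])
    return selected

docs = ["d1", "d2", "d3"]
rel = {"d1": 0.9, "d2": 0.8, "d3": 0.3}
sim = {a: {b: (1.0 if a == b else 0.2) for b in docs} for a in docs}
print(greedy_topk(docs, 2, rel, sim))   # -> ['d1', 'd2']
```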
Funding: This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 60925008, 61170086, and 61021004), the National Basic Research Program of China (973 Program) (2010CB731402), and the National High-tech R&D Program (863 Program) (2012AA011003).
Abstract: As one kind of social media, microblogs are widely used for sensing the real world. The popularity of microblogs is an important measurement for evaluating the influence of pieces of information. The models and modeling techniques for the popularity of microblogs are studied in this paper. A huge data set based on Sina Weibo, one of the most popular microblogging services, is used in the study. First, two different types of popularity, namely the number of retweets and the number of possible views, are defined, and their relationships are discussed. Then, the temporal dynamics of tweets' popularity, including lifecycles and tipping points, are studied. For modeling the temporal dynamics, a piecewise sigmoid model is used. Empirical studies show the effectiveness of our modeling methods.
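For the piecewise sigmoid model, a minimal fitting sketch is shown below; the paper's exact parameterization and its breakpoint detection are not given here, so breakpoints are taken as inputs and each lifecycle phase gets its own sigmoid whose inflection point plays the role of a tipping point:

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(t, L, k, t0):
    # L: saturation popularity, k: growth rate, t0: tipping point
    return L / (1.0 + np.exp(-k * (t - t0)))

def fit_piecewise_sigmoid(t, pop, breakpoints):
    edges = [t.min(), *breakpoints, t.max()]
    params = []
    for lo, hi in zip(edges, edges[1:]):
        m = (t >= lo) & (t <= hi)
        p0 = [pop[m].max(), 1.0, t[m].mean()]        # rough initial guess
        fit, _ = curve_fit(sigmoid, t[m], pop[m], p0=p0, maxfev=10_000)
        params.append(fit)                            # one (L, k, t0) per phase
    return params

t = np.linspace(0, 48, 200)                           # hours since posting
pop = sigmoid(t, 1000, 0.4, 10) + np.random.normal(0, 5, t.size)
print(fit_piecewise_sigmoid(t, pop, breakpoints=[])[0])   # ≈ [1000, 0.4, 10]
```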
Funding: This work was partially supported by the National Key R&D Program of China (2018YFB1003404), NSFC (Grant Nos. 61972149, 61977026), and the ECNU Academic Innovation Promotion Program for Excellent Doctoral Students.
Abstract: The modern in-memory database (IMDB) can support highly concurrent on-line transaction processing (OLTP) workloads and generate massive transactional logs per second. Quorum-based replication protocols such as Paxos or Raft have been widely used in distributed databases to offer higher availability and fault-tolerance. However, it is non-trivial to replicate an IMDB because the high transaction rate brings new challenges. First, the leader node in quorum replication should be adaptive to various transaction arrival rates and to the processing capability of follower nodes. Second, followers are required to replay logs to catch up with the state of the leader in this highly concurrent setting, reducing the visibility gap. Third, modern databases are often built on a cluster of commodity machines connected by low-configuration networks, in which network anomalies often happen. In this case, performance can be significantly affected because the follower node falls into a long-duration exception-handling process (e.g., fetching lost logs from the leader). To this end, we build QuorumX, an efficient and stable quorum-based replication framework for IMDBs under heavy OLTP workloads. QuorumX combines critical-path-based batching and pipeline batching to provide an adaptive log propagation scheme that obtains stable, high performance in various settings. Further, we propose a safe and coordination-free log replay scheme to minimize the visibility gap between the leader and follower IMDBs. We also carefully design the process for the follower node in order to alleviate the influence of an unreliable network on replication performance. Our evaluation results with YCSB, TPC-C and a realistic micro-benchmark demonstrate that QuorumX achieves performance close to asynchronous primary-backup replication and always provides a stable service with data consistency and a low-level visibility gap.
Abstract: Recently, big trajectory data streams are generated in distributed environments with the popularity of smartphones and other mobile devices. The distributed top-k similarity query, which finds the k trajectories most similar to a given query trajectory across all remote sites, is critical in this field. The key challenge in such a query is how to reduce the communication cost due to the limited network bandwidth. This query can be solved by sending the query trajectory to all remote sites, where the pairwise similarities are computed precisely; however, the overall cost, O(n·m), is huge when n or m is large, where n is the size of the query trajectory and m is the number of remote sites. Fortunately, there are cheap ways to estimate pairwise similarity, which filter some trajectories in advance without precise computation. To overcome the challenge in this query, we devise two general frameworks into which concrete distance measures can be plugged. The former uses two bounds (an upper and a lower bound), while the latter uses only the lower bound. Moreover, we introduce detailed implementations for two representative distance measures, Euclidean and DTW distance, after deriving the lower and upper bounds for the former framework and the lower bound for the latter. Theoretical analysis and extensive experiments on real-world datasets evaluate the efficiency of the proposed methods.
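The bound-based filtering can be sketched generically; lower and exact are pluggable callables, matching the abstract's claim that concrete distance measures can be plugged in, and only lower-bound pruning is shown (in the two-bound framework, upper bounds additionally seed and tighten the threshold before any exact computation):

```python
def filter_at_site(trajectories, query, kth_best, lower, exact):
    # kth_best: current k-th smallest exact distance at the coordinator.
    results = []
    for tr in trajectories:
        if lower(query, tr) > kth_best:
            continue                 # cannot enter the top-k: skip exact work
        d = exact(query, tr)         # expensive measure (e.g., DTW) only here
        if d <= kth_best:
            results.append((d, tr))
    return results

# Toy measures for 1-D "trajectories": exact Euclidean on aligned points,
# and a valid lower bound derived from the first point alone.
exact = lambda q, t: sum((a - b) ** 2 for a, b in zip(q, t)) ** 0.5
lower = lambda q, t: abs(q[0] - t[0])
print(filter_at_site([[1, 2], [9, 9]], [1, 1], kth_best=2.0,
                     lower=lower, exact=exact))
# -> [(1.0, [1, 2])]: the far trajectory is pruned without exact computation
```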
Abstract: A social stream refers to a data stream that records a series of social entities and the dynamic interactions between two entities. It can be employed to model the changes of entity states in numerous applications. Social streams, the combination of graph and streaming data, pose a great challenge to efficient analytical query processing, and are key to better understanding users' behavior. Considering privacy and other related issues, a social stream generator is of great significance. A framework for a synthetic social stream generator (SSG) is proposed in this paper. The social streams generated by SSG can be tuned to capture several kinds of fundamental social stream properties, including patterns of users' behavior and graph patterns. Extensive empirical studies with several real-life social stream data sets show that SSG can produce data that fits real data better. It is also confirmed that SSG can generate social stream data continuously with stable throughput and memory consumption. Furthermore, we propose a parallel implementation of SSG with the help of an asynchronous parallel processing model and a delayed update strategy. Our experiments verify that the throughput of the parallel implementation can increase linearly with the number of nodes.