Abstract
Based on an analysis of the characteristics of data skew, this paper presents HybridSkew, a
hybrid hash join algorithm that resists the negative effects of both static and dynamic data
skew, together with a cost analysis model that considers several kinds of parameters from a new
perspective. Applying this model to HybridSkew shows that the algorithm suits systems with
relatively low network bandwidth and/or disk transfer rate, and that it performs best when the
semi-join selectivity is relatively small and the data skew is relatively high.
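To make the notion of data skew concrete, the following is a minimal illustrative sketch of how a skewed join-key distribution unbalances the partitions of a parallel hash join. It is not the HybridSkew algorithm, whose details the abstract does not give. The function names, the modulo stand-in for a hash function, and the sample key sets are all assumptions made for illustration.

```python
from collections import Counter


def partition_loads(keys, n_nodes):
    """Count how many tuples each node receives under hash partitioning.

    Assumption: `key % n_nodes` stands in for the real hash function.
    """
    loads = Counter(key % n_nodes for key in keys)
    return [loads.get(node, 0) for node in range(n_nodes)]


def skew_ratio(loads):
    """Ratio of the busiest node's load to the mean load (1.0 means balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


# Hypothetical inputs: 1000 distinct keys versus one key repeated 700 times.
uniform_keys = list(range(1000))
skewed_keys = [7] * 700 + list(range(300))

uniform_loads = partition_loads(uniform_keys, 4)
skewed_loads = partition_loads(skewed_keys, 4)

print(uniform_loads, skew_ratio(uniform_loads))
print(skewed_loads, skew_ratio(skewed_loads))
```

With the skewed input, every tuple carrying the repeated key lands on a single node, so that node dominates the join's completion time. This is the kind of imbalance a skew-resistant join algorithm aims to avoid.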
Source
Journal of Huazhong University of Science and Technology (《华中理工大学学报》)
CSCD
Peking University Core Journals (北大核心)
1999, No. 4, pp. 34-36 (3 pages)
Funding
Supported by the National High Technology Research and Development Program of China
Keywords
parallel query
parallel binary join
data skew
database
hash algorithm