FSC: Using Frequent Set Mining for View Size Estimation


Abstract: Online analytical processing (OLAP) usually involves complex queries on very large databases. Pre-aggregation, that is, materializing views in advance, is frequently used to speed up query response. Before deciding which views to materialize, the size of each candidate view must be estimated. Existing solutions fall into two categories. The first relies on probabilistic counting and mathematical approximation. The second assumes that the data follow an a priori distribution model, estimates the model's parameters on a sample, and extrapolates them to the whole dataset. This paper presents a novel approach named FSC (frequent sets counting), which applies frequent set mining to view size estimation and derives size estimates for all views in a cube with two scans of the database. Experimental results indicate that FSC estimates view sizes more accurately than comparable schemes, especially on highly skewed datasets.
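To make the estimation problem concrete, the following minimal Python sketch computes the exact size of every view in a small cube and compares it with a uniform-distribution estimate of the kind used by the first category of methods (Cardenas' formula, m(1 - (1 - 1/m)^n)). It is an illustration only and does not reproduce FSC, whose details are not given in the abstract. The function names `exact_view_sizes` and `cardenas_estimate` and the toy data are assumptions made for this example.

```python
from itertools import combinations


def exact_view_sizes(rows, dims):
    """Return {group-by attributes: number of distinct groups} for every view.

    A view's size is the number of distinct value combinations that appear
    in the fact table for that view's group-by attributes.
    """
    sizes = {}
    for k in range(len(dims) + 1):
        for view in combinations(range(len(dims)), k):
            groups = {tuple(row[i] for i in view) for row in rows}
            sizes[tuple(dims[i] for i in view)] = len(groups)
    return sizes


def cardenas_estimate(n_rows, cardinalities):
    """Expected number of distinct groups under a uniform-distribution assumption.

    m is the number of possible value combinations; each of the n_rows rows
    is assumed to fall on any of them with equal probability.
    """
    m = 1
    for c in cardinalities:
        m *= c
    return m * (1.0 - (1.0 - 1.0 / m) ** n_rows)


# Hypothetical, deliberately skewed fact table: one combination dominates.
dims = ("A", "B")
rows = [("a", "x")] * 6 + [("b", "y"), ("c", "z")]

sizes = exact_view_sizes(rows, dims)
estimate = cardenas_estimate(len(rows), [3, 3])  # 3 distinct values per attribute

print(sizes[("A", "B")])  # exact size of the finest view: 3
print(round(estimate, 2))  # uniform-model estimate, noticeably larger than 3
```

On this toy table the uniform model predicts roughly 5.5 groups for the view on (A, B), while only 3 exist. Overestimation of this kind on skewed data is the situation in which the abstract reports that FSC improves accuracy.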

Authors: Zou Yuanya (邹远娅), Zhou Haofeng (周皓峰), Wang Chen (王晨), Wang Wei (汪卫), Shi Baile (施伯乐)

Affiliation: Department of Computing and Information Technology, Fudan University

Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2004, No. 10, pp. 1670-1676 (7 pages)

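Funding: Key Program of the National Natural Science Foundation of China (69933010, 60303008); National High Technology Research and Development Program of China ("863" Program) (2002AA4Z3430, 2002AA231041)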

Keywords: view size estimation, frequent set, uniform distribution, data skewness

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

