A Method of Stream Cube Computing Based on Interesting View Subset (基于兴趣视图子集的流立方体计算方法)

Cited by: 2

Abstract: Stream cube computation is the foundation of multidimensional analysis over data streams. However, the dynamic, unbounded, and bursty nature of data streams, together with the complexity of multidimensional data structures, poses serious challenges in storage space, update efficiency, and adaptability. In practice, users' interest usually concentrates on only a portion of the views. Based on this observation, this paper proposes a computing method based on an interesting view subset. The interesting view subset and the interesting paths are derived from users' historical query information, and they should be updated over time if query-answering efficiency decreases. A Stream-Tree structure is defined to materialize, in main memory, the cells contained in the interesting view subset and the drilling paths. At run time, the cells in the Stream-Tree are continuously updated as new tuples arrive, and old cells are deleted periodically according to the constraints of multi-level time windows. Sparse cells are not divided into finer ones; only their high-level aggregates are preserved. Experiments and analysis indicate that the method can maintain the stream cube cells of the current time window in limited main memory, while supporting fast updates of the storage structure and fast responses to user queries.
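The abstract does not give the Stream-Tree definition, so the following is only a minimal sketch of the general idea, under several assumptions: a single drilling path ordered from coarse to fine, a single sliding window of time slots standing in for the paper's multi-level time windows, and a simple tuple-count threshold for deciding when a cell is sparse. The names `StreamTree`, `Node`, `min_support`, and `window` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


class Node:
    """One materialized cell: per-slot aggregates plus finer-grained children."""

    def __init__(self):
        self.slots = defaultdict(float)  # time slot -> summed measure
        self.counts = defaultdict(int)   # time slot -> number of tuples
        self.children = {}               # value of the next dimension -> Node

    def total_count(self):
        return sum(self.counts.values())


class StreamTree:
    """Illustrative tree over one drilling path (an assumption, not the paper's structure)."""

    def __init__(self, path, window, min_support):
        self.path = path                # dimensions ordered coarse to fine
        self.window = window            # number of most recent time slots kept
        self.min_support = min_support  # tuples a cell needs before it is refined
        self.root = Node()

    def insert(self, tup, measure, slot):
        node = self.root
        node.slots[slot] += measure
        node.counts[slot] += 1
        for dim in self.path:
            child = node.children.setdefault(tup[dim], Node())
            child.slots[slot] += measure
            child.counts[slot] += 1
            # A sparse cell keeps only its own aggregate and is not refined further.
            # Tuples seen while it was sparse are therefore missing from finer cells.
            if child.total_count() < self.min_support:
                break
            node = child

    def expire(self, current_slot):
        """Drop slots that fall outside the window ending at current_slot."""
        self._expire(self.root, current_slot - self.window)

    def _expire(self, node, cutoff):
        for s in [s for s in node.slots if s <= cutoff]:
            del node.slots[s]
            del node.counts[s]
        for key in list(node.children):
            child = node.children[key]
            self._expire(child, cutoff)
            if not child.counts:  # no data left in the window: prune the cell
                del node.children[key]

    def query(self, values):
        """Sum of the cell addressed by a prefix of path values, or None if not materialized."""
        node = self.root
        for v in values:
            node = node.children.get(v)
            if node is None:
                return None
        return sum(node.slots.values())
```

In this sketch a query for a cell below a sparse parent returns `None`, and the caller would fall back to the coarser aggregate, which mirrors the statement that only high-level aggregates are preserved for sparse cells.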

Authors: Hou Dongfeng (侯东风), Zhang Weiming (张维明), Liu Qingbao (刘青宝), Deng Su (邓苏)

Affiliation: National University of Defense Technology (国防科学技术大学) C

Source: Journal of Computer Research and Development (《计算机研究与发展》; indexed in EI, CSCD, PKU Core), 2011, No. 12, pp. 2369-2378 (10 pages)

Funding: National Natural Science Foundation of China (70771110)

Keywords: data stream; stream cube; multidimensional analysis; interesting view subset; multi-level time windows

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]


