期刊文献+

BTreeU-Topk:基于二叉树的不确定数据上的Top-k查询算法 被引量:2

BTreeU-Topk : Binary-Tree Based Top-k Query Algorithms on Uncertain Data
下载PDF
导出
摘要 应用需求的发展衍生各种查询类型,Top-k查询是交互环境下一种重要查询类型.由于数据的不确定性,传统数据上的Top-k查询技术和方法不能直接应用于不确定数据查询.在已有不确定数据上Top-k查询算法的基础上,提出基于二叉树的不确定数据上Top-k查询算法BTreeU-Topk;为了提高算法执行效率,对二叉树进行修剪操作进而提出BTreeOPTU-Topk和BTreePU-Topk算法.实验结果表明,BTreeU-Topk,BTreeOPTU-Topk以及BTreePU-Topk算法在不同数据分布以及k值增长时均优于现有算法. Development of applications in uncertain data management area derives various kinds of queries. Among these, Top-k query has attracted considerable research attention which is an important query type and returns the most important k objects based on some characteristic values. Because of the uncertainty of application data, traditional Top-k methods and techniques could not be directly applied to query on uncertain data. In this paper, a new Top-k algorithm based on binary-tree named BTreeU-Topk is provided under existing Top-k algorithms on uncertain data. BTreeU-Topk algorithm utilizes binary-tree to simply representation of possible worlds space. To further improve the efficiency of the proposed algorithm, pruning technique is adopted and the BTreeOPTU-Topk and BTreePlJ-Topk algorithms are proposed. BTreeOPTU-Topk algorithm adopts optimized queue to prune branches not in final results while BTreePU-Topk algorithm utilizes pruning rule. Experimental results show that the proposed BTreeU-Topk, BTreeOPTU-Topk and BTreePU-Topk algorithms are superior to existing algorithms in two aspects- different data distributions and increase of k values. In addition, the quantity of maintained states is reduced by our proposed algorithms.
出处 《计算机研究与发展》 EI CSCD 北大核心 2012年第10期2095-2105,共11页 Journal of Computer Research and Development
基金 高等学校博士学科点专项科研基金项目(20103218110017) 江苏高校优势学科建设工程资助项目 南京航空航天大学青年科技创新基金项目(NS2010116 NN2012102)
关键词 不确定数据 可能世界语义 二叉树 Top—k BTreeU—Topk U—Topk uncertain data possible worlds semantics Binary-Tree Top-k BTreeU-Topk U-Topk
  • 相关文献

参考文献26

  • 1周傲英,金澈清,王国仁,李建中.不确定性数据管理技术研究综述[J].计算机学报,2009,32(1):1-16. 被引量:185
  • 2Sarma A D, Benjelloun O, Halevy A, et al. Working models for uncertain data [C] //Proc of the 22nd IEEE Int Conf on Data Engineering (ICDE). Los Alamitos, CA: IEEE Computer Society, 2006.
  • 3Tao Yufei, Cheng R, Xiao Xiaokui, et al. Indexing multi dimensional uncertain data with arbitrary probability density functions [C] //Proc of the 31st Int Conf on Very Large Data Bases (VLDB). New York: ACM, 2005:922-933.
  • 4Kriegel H P, Kunath P, Pfeifle M, et al. Probabilistic Similarity join on uncertain data [C] //Proc of the Int Conf on Database Systems for Advanced Applications (DASFAA). Berlin:Springer, 2006: 295-809.
  • 5Kriegel H P, Kunath P, Renz M neighbor query on uncertain objects Int Conf on Database Systems for (DASFAA). Berlin: Springer, 2007 Probabilistic nearest-[C] //Proc of the 12th Advanced Applications :337-348.
  • 6Pei Jian, Jiang Bin, Lin Xuemin, et al. Probalilistic skylines on uncertain data [C] //Proc of the 33rd Int Conf on Very Large Data Base (VLDB). New York: ACM, 2007:15-26.
  • 7Soliman M A, llyas I F. Top-k Query processing in uncertain databases [C] //Proe of the 23rd Int Conf on Data Engineering (ICDE). Los Alamitos, CA: IEEE Computer Society, 2007:896-905.
  • 8Soliman M A, Ilyas I F, Chang K C. Probabilistic top k and ranking-aggregate queries [J]. ACM Trans on Database Systems (TODS), 2008, 33(3): 1-54.
  • 9Ilyas I F, Beskales G, Soliman M A. A survey of top-k query processing techniques in relational database systems [J]. ACM Computing Surveys (CSUR), 2008, 40(4): 1-58.
  • 10Hua Ming, Pei Jian, Zhang Wenjie, et al. Ranking queries on uncertain data: A probabilistic threshold approach [C] // Proc of the 2008 ACM SIGMOD Int Conf on Management of data (SIGMOD). New York: ACM, 2008:673-686.

二级参考文献98

  • 1金澈清,钱卫宁,周傲英.流数据分析与管理综述[J].软件学报,2004,15(8):1172-1181. 被引量:161
  • 2谷峪,于戈,张天成.RFID复杂事件处理技术[J].计算机科学与探索,2007,1(3):255-267. 被引量:54
  • 3Deshpande A, Guestrin C, Madden S, Hellerstein J M, Hong W. Model-driven data acquisition in sensor networks// Proceedings of the 30th International Conference on Very Large Data Bases. Toronto, 2004:588-599
  • 4Madhavan J, Cohen S, Xin D, Halevy A, Jeffery S, Ko D, Yu C. Web-scale data integration: You can afford to pay as you go//Proceedings of the 33rd Biennial Conference on Innovative Data Systems Research. Asilomar, 2007:342-350
  • 5Liu Ling. From data privacy to location privacy: Models and algorithms (tutorial)//Proceedings of the 33rd International Conference on Very Large Data bases. Vienna, 2007: 1429- 1430
  • 6Samarati P, Sweeney L. Generalizing data to provide anonymity when disclosing information (abstract)//Proeeedings of the 17th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. Seattle, 1998:188
  • 7Cavallo R, Pittarelli M. The theory of probabilistic databases//Proceedings of the 13th International Conference on Very Large Data Bases. Brighton, 1987:71-81
  • 8Barbara D, Garcia-Molina H, Porter D. The management of probabilistic data. IEEE Transactions on Knowledge and Data Engineering, 1992, 4(5): 487-502
  • 9Fuhr N, Rolleke T. A probabilistic relational algebra for the integration of information retrieval and database systems. ACM Transactions on Information Systems, 1997, 15(1): 32-66
  • 10Zimanyi E. Query evaluation in probabilistic databases. Theoretical Computer Science, 1997, 171(1-2): 179-219

共引文献184

同被引文献30

  • 1Chang C, Acharya A, Sussman A, et al. T2: A customizable parallel database for multi-dimensional data[ A ]. SIGMOD [ C ]. [ s. l. ], USA: ACM Press, 1998:221-232.
  • 2Marathe A P, Salem K. Query processing techniques for arrays[J]. The VLDB Journal,2002,11 ( 1 ) :68-91.
  • 3SciDB Community Forum. SciDB: The computational DBMS for data-obsessed organizations; programmable from R & Python[ EB/OL]. http ://scidb. org/,2014- 11-01.
  • 4Antova L, Jansen T, Koch C, et al. Fast and simple relational processing of uncertain data[ A]. 2008 IEEE 24th International Conference on Data Engineering (ICDE'08) [ C ]. Cancun, Mexico : ICDE ,2008.
  • 5Benjelloun O, Das Sarma A, I-Ialevy A, et al. ULDBs: Databases with uncertainty and lineage [ A ]. VLDB'06 Proceedings of the 32nd International Conference on Very large Data Bases[ C ]. Seoul, Korea:2006 VLDB Endowment, 2006 : 953 -964.
  • 6Sen P, Deshpande A. Representing and querying correlated tuples in probabilistic databases [ A ]. 2007 IEEE 23rd International Conference on Data Engineering ( ICDE, 2007 ) [ C ]. Istanbul, Turkey: IEEE, 2007 : 596-605.
  • 7Jordan M. Learning in graphical models [ M ]. Cambridge, MA, USA : MIT Press, 1998.
  • 8Ge T, Zdonik S. Handling uncertain data in array database systems [ A ]. IEEE 24th International Conference on Data Engineering ( ICDE 2008 ) [ C ]. Cancun, Mexico : IEEE ,2008 : 1140-1149.
  • 9Ge Tingjian, Zdonik S. Handling uncertain data in array database systems [ A ]. IEEE 24th International Conference on Data Engineering ( ICDE 2008 ) [ C ]. Cancun, Mexico : ICDE,2008 : 1140-1149.
  • 10Jampani R, Xu Fei, Wu Mingxi, et al. MCDB : A Monte Carlo approach to managing uncertain data [ A ]. Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data ( SIGMOD' 08 ) [ C ]. New York, NY, USA : ACM press ,2008.

引证文献2

二级引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部