期刊文献+

基于x-tuple的概率阈值top-k查询算法

Probabilistic Threshold top-k Query Algorithm Based on x-tuple
下载PDF
导出
摘要 不确定数据库中的概率阈值top-k查询是计算元组排在前k位的概率和,返回概率和不小于p的元组,但现有的查询语义没有将x-tuple内的元组进行整体处理。针对该情况,定义一种新的查询语义——概率阈值x-top-k查询,并给出查询处理算法。在该查询语义下采用动态规划方法求取x-tuple内每个元组排在前k位的概率和,对其进行聚集后做概率阈值top-k查询,并利用观察法、最大上限值等剪枝方法进行优化。实验结果表明,该算法平均扫描全体数据集中60%的数据即可返回正确结果集,证明其查询处理效率较高。 Probabilistic threshold top-k query calculation sum of the probability of the tuple ranked top-k and return the tuples whose sum of the probability are at least p. But top-k query does not take x-tuple as a whole, thus a new top-k query semantic probabilistic threshold x-top-k query is defined and an algorithm is given to process it, which uses dynamic method to acquire sum of the probability of the tuple, then process aggregate probabilities with top-k query. It uses several pruning methods like the upper bound method and so on to optimize the algorithm. Experimental result shows that the algorithm return the answer set for scanning about 60% of data set, and it demonstrates that the algorithm is efficient.
出处 《计算机工程》 CAS CSCD 2013年第4期44-47,共4页 Computer Engineering
基金 国家"973"计划基金资助项目"海量信息可用性基础理论与关键技术研究"(2012CB316200) 南北极环境综合考察与评估专项基金资助项目(CHINARE2012-04-07)
关键词 不确定数据库 概率阈值top-k查询 x-元组 动态规划算法 聚集 uncertain database probabilistic threshold top-k query x-tuple dynamic programming algorithm aggregation
  • 相关文献

参考文献10

  • 1Liu Ling. From Data Privacy to Location Privacy: Models and Algorithms[C]//Proc. of the 33rd Int'l Conf. on Very Large Data Bases. New York, USA: ACM Press, 2007: 1429-1430.
  • 2Yi Ke, Li Feifei, Srivastava D, et al. Efficient Processing of Top-k Queries in Uncertain Databases with X- relations[J]. IEEE Trans. on Knowledge and Data Engineering, 2008, 20(12): 1669-1682.
  • 3Soliman M A, Ilyas I F, Chang K C C. Probabilistic Top-k and Ranking-aggregate Queries[J]. ACM Trans. on Database System, 2008, 33(3): 131-136.
  • 4Soliman M A, Ilyas I F. Top-k Query Processing in Uncertain Database[C]//Proc. of ICDE'07. [S. 1.]: IEEE Press, 2007.
  • 5Cheng R. Evaluating Probabilistic Queries over Imprecise Data[C]//Proc. of SIGMOD'03. New York, USA: ACM Press, 2003.
  • 6Ilyas I F, Beskales G, Soliman M. A Survey of Top-k Query Processing Techniques in Relational Database Systems[J]. ACM Computing Surveys, 2008, 40(4): 1-58.
  • 7周傲英,金澈清,王国仁,李建中.不确定性数据管理技术研究综述[J].计算机学报,2009,32(1):1-16. 被引量:185
  • 8Agrawal P, Benjelloun O, Das S A, et al. Trio: A System for Data Uncertainty and Linage[C]//Proc. of the 32nd Int'l Conf. on Very Large Data Bases. [S. 1.]: ACM Press, 2006.
  • 9刘德喜,万常选,刘喜平.不确定数据库中基于x-tuple的高效Top-k查询处理算法[J].计算机研究与发展,2010,47(8):1415-1423. 被引量:4
  • 10Hua Ming, Pei Jian, Zhang Wenjie, et al. Ranking Queries on Uncertain Data: A Probabilistic Threshold Approach[C]// Proc. of SIGMOD'08. New York, USA: ACM Press, 2008: 9-12.

二级参考文献115

  • 1金澈清,钱卫宁,周傲英.流数据分析与管理综述[J].软件学报,2004,15(8):1172-1181. 被引量:161
  • 2谷峪,于戈,张天成.RFID复杂事件处理技术[J].计算机科学与探索,2007,1(3):255-267. 被引量:54
  • 3崔逊学,方红雨,朱徐来.传感器网络定位问题的概率特征[J].计算机研究与发展,2007,44(4):630-635. 被引量:14
  • 4Deshpande A, Guestrin C, Madden S, Hellerstein J M, Hong W. Model-driven data acquisition in sensor networks// Proceedings of the 30th International Conference on Very Large Data Bases. Toronto, 2004:588-599
  • 5Madhavan J, Cohen S, Xin D, Halevy A, Jeffery S, Ko D, Yu C. Web-scale data integration: You can afford to pay as you go//Proceedings of the 33rd Biennial Conference on Innovative Data Systems Research. Asilomar, 2007:342-350
  • 6Liu Ling. From data privacy to location privacy: Models and algorithms (tutorial)//Proceedings of the 33rd International Conference on Very Large Data bases. Vienna, 2007: 1429- 1430
  • 7Samarati P, Sweeney L. Generalizing data to provide anonymity when disclosing information (abstract)//Proeeedings of the 17th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. Seattle, 1998:188
  • 8Cavallo R, Pittarelli M. The theory of probabilistic databases//Proceedings of the 13th International Conference on Very Large Data Bases. Brighton, 1987:71-81
  • 9Barbara D, Garcia-Molina H, Porter D. The management of probabilistic data. IEEE Transactions on Knowledge and Data Engineering, 1992, 4(5): 487-502
  • 10Fuhr N, Rolleke T. A probabilistic relational algebra for the integration of information retrieval and database systems. ACM Transactions on Information Systems, 1997, 15(1): 32-66

共引文献186

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部