Top-K Keyword Query Processing on XML Data Streams (XML数据流上Top-K关键字查询处理), cited by: 8

Efficient Top-K Keyword Search on XML Streams


Abstract: Keyword search makes it possible to query XML data without knowing its schema. In existing keyword query processing on XML streams, a single ranking function often cannot satisfy the differing needs of all users. This paper approaches the problem from another angle and proposes skyline-based Top-K keyword queries, a novel kind of keyword query on XML streams. For such queries there is no need to model the complicated factors that influence how relevant a result is to the query; the skyline alone is used to select the most relevant results. Two efficient algorithms for processing skyline-based Top-K keyword queries on XML streams are presented, one for single queries and one for multiple queries. Extensive experiments verify the effectiveness, efficiency, and scalability of both algorithms. According to the experimental results, the algorithms are almost insensitive to parameters such as the number of keywords, the number of results, and the number of queries, and the running time is approximately linear in the document size.
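The skyline idea that the abstract relies on can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say which dimensions are compared, how results are found in the stream, or how K results are chosen. The sketch assumes, purely for illustration, that each candidate result is described by a hypothetical vector of per-keyword distances (smaller is better), and it keeps only the candidates that no other candidate dominates, using a simple nested-loop scan.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every dimension and
    strictly better in at least one (smaller values are better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Vector]) -> List[Vector]:
    """Nested-loop skyline: keep the points that no other point dominates."""
    window: List[Vector] = []
    for p in points:
        # Skip p if a point already in the window dominates it.
        if any(dominates(w, p) for w in window):
            continue
        # Otherwise p evicts every window point that it dominates.
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


# Hypothetical candidates: (distance to keyword 1, distance to keyword 2).
candidates = [(1, 4), (2, 2), (3, 3), (4, 1), (2, 5)]
print(skyline(candidates))  # [(1, 4), (2, 2), (4, 1)]
```

Here (3, 3) is dropped because (2, 2) is better in both dimensions, and (2, 5) is dropped because (1, 4) is. No scoring function that weights the two dimensions is needed to make that decision, which is the property the abstract appeals to.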

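Authors: LI Lingli (黎玲利), WANG Hongzhi (王宏志), GAO Hong (高宏), LI Jianzhong (李建中)

Affiliation: School of Computer Science and Technology, Harbin Institute of Technology

Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 6, pp. 1561-1577 (17 pages)

Funding: National Natural Science Foundation of China (61003046, 61111130189); National Basic Research Program of China (973 Program) (2012CB316200); Specialized Research Fund for the Doctoral Program of Higher Education (20102302120054)

Keywords: XML streams; keyword search; Top-K; skyline

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]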
