
Hybrid Index Mechanism Based on Hilbert R Tree and LDA Model (cited by: 1)
Abstract: Existing spatial keyword search methods usually rely on a hybrid index built on the R-tree: they locate the relevant text according to the query position and, at query time, perform simple text matching using edit distance or a statistical language model. However, multi-dimensional R-trees have a high overlap ratio between spatial regions, and simple text matching tends to miss semantically related text. To improve spatial query efficiency and text-matching accuracy, a hybrid index structure named the Hilbert Retrieving Information Tree (HRI-Tree) is constructed and used to answer top-k queries. An inverted index of keywords is added to the nodes of a Hilbert R-tree, and the LDA topic model is applied so that topic classification retrieves semantically related text more accurately, returning the top-k results that approximately match the query text and are spatially close to the query location. In experiments, the algorithm is compared with current methods in terms of query time, node overlap coverage (of the MBRs), and text-matching accuracy, and is reported to perform better on these measures.
Authors: XU Yi-dan; HAN Jing-yu (College of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210000, China; Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing, Jiangsu 210000, China; Key Laboratory of Computer Network and Information Integration, Ministry of Education, Southeast University, Nanjing, Jiangsu 210000, China)
Source: Computer Simulation (《计算机仿真》), PKU Core journal, 2019, No. 12, pp. 415-420 (6 pages)
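Funding: National Natural Science Foundation of China (61602260); Key Laboratory of Computer Network and Information Integration, Ministry of Education, Southeast University (K93-9-2015-07C); Natural Science Foundation of Jiangsu Province, General Program (BK20171447); Natural Science Research Project of Jiangsu Higher Education Institutions, General Program (17KJB520024)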
Keywords: spatial keyword; topic model; hybrid index structure; query
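
The abstract names two ingredients without giving their details: ordering spatial objects along a Hilbert curve (the basis of a Hilbert R-tree) and ranking candidates by both spatial closeness and topic similarity. The following is a minimal Python sketch of those two ideas only, not the paper's HRI-Tree. The function names, the linear weight `alpha`, the normalizing distance `max_dist`, and the cosine comparison of topic vectors are all assumptions made for illustration. The topic vectors are assumed to come from an already trained LDA model, and the ranking is a brute-force scan that does no index pruning.

```python
import heapq
import math


def hilbert_index(order, x, y):
    """Position of grid cell (x, y) on a Hilbert curve over a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(objects, query_loc, query_topics, k, alpha=0.5, max_dist=1.0):
    """Rank (obj_id, (x, y), topic_vector) tuples by a hypothetical combined score.

    score = alpha * spatial_closeness + (1 - alpha) * topic_similarity
    This weighting is an assumption; the abstract does not state a scoring formula.
    """
    scored = []
    for obj_id, loc, topics in objects:
        dist = math.hypot(loc[0] - query_loc[0], loc[1] - query_loc[1])
        spatial = 1.0 - min(dist / max_dist, 1.0)  # 1.0 at the query point
        textual = cosine(topics, query_topics)
        scored.append((alpha * spatial + (1 - alpha) * textual, obj_id))
    return [obj_id for _, obj_id in heapq.nlargest(k, scored)]
```

Sorting objects by `hilbert_index` before packing them into leaf nodes keeps spatially close objects in the same node, which is the usual motivation for Hilbert-ordered R-trees. A real index would also use per-node inverted lists and bounding rectangles to skip subtrees instead of scoring every object.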
