
A Ranking Algorithm of Keyword Search on Probabilistic XML Data (cited by: 1)
Abstract: This paper studies how to rank the results of content-related keyword searches over collections of probabilistic XML documents, and proposes a new ranking scheme based on the characteristics of such documents. Unlike existing ranking algorithms, which depend only on the probabilities of the retrieval results, the proposed algorithm also takes into account how well a node discriminates between documents, how well a node describes a document, and the structural characteristics of the XML documents themselves. A ranking model for retrieval results that reflects these features is designed, and a new inverted index structure is proposed to support it. The new algorithm completes keyword searches quickly and returns the most relevant information to the user. Experiments on synthetic datasets show that the proposed method is effective.
Source: Journal of Northeastern University (Natural Science), 2016, No. 8, pp. 1095-1099 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
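Funding: National Natural Science Foundation of China (6100024, 61332006, U1401256); National Key Basic Research Program of China (2011CB302200-G); Fundamental Research Funds for the Central Universities (N130504006).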
Keywords: keyword search; probabilistic XML data; SLCA (smallest lowest common ancestor); ranking
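
The abstract names the ingredients of the approach (an inverted index, SLCA results, result probabilities, and node-level discrimination and description weights) but gives no formulas. The following is a minimal illustrative sketch only, not the authors' model: it assumes nodes are identified by Dewey codes, that each SLCA result already comes with a probability (the `prob` table is made up), and it substitutes a plain TF-IDF-style weight for the paper's discrimination and description measures. All function names and data are hypothetical.

```python
import math
from collections import defaultdict

def build_inverted_index(nodes):
    """Map each keyword to the Dewey codes of the nodes whose text contains it."""
    index = defaultdict(list)
    for dewey, text in nodes.items():
        for word in set(text.lower().split()):
            index[word].append(dewey)
    return index

def slca(index, keywords):
    """Nodes whose subtree covers every keyword and that have no descendant
    covering every keyword (smallest lowest common ancestors)."""
    per_keyword = []
    for kw in keywords:
        covering = set()
        for dewey in index.get(kw, []):
            # Every prefix of a Dewey code is an ancestor-or-self of that node.
            for i in range(1, len(dewey) + 1):
                covering.add(dewey[:i])
        per_keyword.append(covering)
    common = set.intersection(*per_keyword) if per_keyword else set()
    return sorted(
        c for c in common
        if not any(len(d) > len(c) and d[:len(c)] == c for d in common)
    )

def score(result, index, keywords, prob, n_nodes):
    """Assumed score: result probability times a TF-IDF-style keyword weight."""
    total = 0.0
    for kw in keywords:
        postings = index.get(kw, [])
        # tf: occurrences inside the result subtree (describes the result).
        tf = sum(1 for d in postings if d[:len(result)] == result)
        # idf: rarer keywords discriminate better between results.
        idf = math.log(1 + n_nodes / len(postings)) if postings else 0.0
        total += tf * idf
    return prob.get(result, 1.0) * total

# Toy document: Dewey code -> node text (hypothetical data).
nodes = {
    (1,): "",
    (1, 1): "",
    (1, 1, 1): "xml ranking",
    (1, 1, 2): "keyword search",
    (1, 2): "",
    (1, 2, 1): "xml keyword",
    (1, 2, 2): "probabilistic data",
}
prob = {(1, 1): 0.9, (1, 2, 1): 0.4}  # assumed result probabilities
keywords = ["xml", "keyword"]

index = build_inverted_index(nodes)
results = slca(index, keywords)
ranked = sorted(
    results,
    key=lambda r: score(r, index, keywords, prob, len(nodes)),
    reverse=True,
)
print(ranked)
```

In this toy case both SLCA results match each keyword once, so the probability decides the order; with uneven keyword counts the TF-IDF-style term can move a lower-probability result ahead, which is the kind of behaviour the abstract contrasts with probability-only ranking.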