
Subject Searching Based on Links Structure Analysis (Cited by: 2)
Abstract: Keyword matching, the method used by most current text search engines, leads to relatively low search efficiency. Building on an analysis of existing measures of semantic relatedness, this paper proposes a thematic search strategy based on link structure analysis, which exploits the information implied by the rich link structure of Wikipedia. A term relatedness algorithm is designed to describe the distance between words in terms of DBW, underpinned by a computational thematic communities model. The algorithm re-ranks terms by relatedness in order to discover the closest keyword clusters and improve the quality of search results. The experiments introduce a user evaluation mechanism and compare the results with those of a traditional search strategy. The results show that the strategy expands thematic coverage while maintaining a high level of user intent recognition.
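The idea of ranking terms by a link-based distance can be illustrated with a minimal sketch. This record does not define DBW, so the example below substitutes a Jaccard distance over link neighbourhoods as a stand-in; the toy link graph `LINKS` and the helper names `link_distance` and `rank_terms` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy link graph: article title -> titles it links to.
# A real system would build this from Wikipedia's link structure.
LINKS: Dict[str, Set[str]] = {
    "Search engine": {"Information retrieval", "Web crawler", "PageRank"},
    "PageRank": {"Search engine", "Web crawler", "Hyperlink"},
    "Web crawler": {"Search engine", "Hyperlink", "PageRank"},
    "Cooking": {"Recipe", "Food"},
}


def link_distance(a: str, b: str, links: Dict[str, Set[str]]) -> float:
    """Jaccard distance between the link neighbourhoods of two terms.

    Assumption: this stands in for the paper's DBW measure, which is
    not specified in the abstract. Each term counts as its own neighbour,
    so directly linked terms share at least two elements.
    """
    na = links.get(a, set()) | {a}
    nb = links.get(b, set()) | {b}
    return 1.0 - len(na & nb) / len(na | nb)


def rank_terms(
    query: str, candidates: List[str], links: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Order candidate terms from closest to farthest from the query term."""
    scored = [(term, link_distance(query, term, links)) for term in candidates]
    return sorted(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    ranking = rank_terms(
        "Search engine", ["Cooking", "PageRank", "Web crawler"], LINKS
    )
    for term, dist in ranking:
        print(f"{term}: {dist:.2f}")
```

In this toy graph the two terms that share links with "Search engine" rank ahead of the unrelated term "Cooking", which is the re-ranking behaviour the abstract describes in general terms.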
Source: Journal of Beijing University of Technology (indexed in EI, CAS, CSCD, PKU Core), 2011, No. 4, pp. 614-618, 623 (6 pages)
Funding: National Natural Science Foundation of China (70671007)
Keywords: Wikipedia; network clustering; knowledge discovery in databases (KDD)

References (10)

  • 1. MARKUS K, DENNY V, MAX V. Wikipedia and the semantic Web: the missing links[C]//Wikimania 2005. Frankfurt am Main, Germany: Association for Computing Machinery Press (ACM), 2005: 117-125.
  • 2. MAX V, MARKUS K, DENNY Vrandecic, et al. Semantic Wikipedia[C]//WWW2006. Edinburgh, Scotland: Association for Computing Machinery Press (ACM), 2005: 265-274.
  • 3. DAVID A. SHAWN: structure helps a Wiki navigate[C]//BTW Workshop WebDB Meets IR. Arlington: AAAI Press, 2005: 97-108.
  • 4. NATALIA K. Automatic ontology extraction for document classification[D]. Saarbrücken, Germany: Max-Planck-Institute for Computer Science, Saarland University, 2006.
  • 5. DANIEL K. Wikisense-mining the Wiki[C]//Wikimania 2005. Frankfurt am Main, Germany: Association for Computing Machinery Press (ACM), 2005: 254-276.
  • 6. CHAKRABARTI S. Data mining for hypertext: a tutorial survey[C]//SIGKDD Explorations. Cambridge: MIT Press, 2000: 113-125.
  • 7. JAKOB V. Measuring Wikipedia[C]//ISSI 2005. Stockholm, Sweden: Karolinska University Press, 2005: 21-36.
  • 8. FRANCESCO B, ROBERTO B. Network analysis for Wikipedia[C]//Wikimania 2005. Frankfurt am Main, Germany: Association for Computing Machinery Press (ACM), 2005: 334-367.
  • 9. SERGEY B, LAWRENCE P. The anatomy of a large-scale hypertextual Web search engine[J]. Computer Networks and ISDN Systems, 1998, 30(1/7): 107-117.
  • 10. JON K. Authoritative sources in a hyperlinked environment, RJ 10076[R]. New York: IBM, 1997.

Co-cited documents (8)

  • 1. LIN Haixia, YUAN Fuyong, CHEN Jinsen, LIU Junfeng. An improved topic-focused web spider search algorithm[J]. Computer Engineering and Applications, 2007, 43(10): 174-176. Cited by: 18
  • 2. Pant, F Menczer. Topical Crawling for Business Intelligence[C]//T Koch and I Solvberg. Proc. 7th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), series Lecture Notes in Computer Science, Vol. 2769. Berlin, 2003.
  • 3. Aggarwal C, AL-Garawi F, Yu S P. Intelligent crawling on the world wide web with arbitrary Predicate[C]//Hong Kong: Proc of the 10th International World Wide Web Conference, 2001.
  • 4. Menczer, G Pant, P Srinivasan. Topical Web Crawlers: Evaluating Adaptive Algorithms[J]. ACM Transactions on Internet Technology, 2004, 4(4): 378-419.
  • 5. Chen Huei Liao, Bor Chen Kuo, Kai Chih Pai. Effectiveness of Automated Chinese Sentence Scoring with Latent Semantic Analysis[J]. The Turkish Online Journal of Educational Technology, 2012, 11(2): 80-87.
  • 6. GUO Jingfeng, MA Xin, DAI Junli. Web page clustering based on a text-link model and the affinity propagation algorithm[J]. Application Research of Computers, 2010, 27(4): 1255-1258. Cited by: 3
  • 7. ZHAO Huajun, ZHONG Caiming, LI Wen, WANG Ruizhi, MIAO Duoqian. Clustering and visualization of web search results[J]. Journal of Nanjing University (Natural Sciences), 2010, 46(5): 542-551. Cited by: 5
  • 8. HE Qiufang, ZENG Qijie, CAI Yanguang. An enhanced community web page clustering algorithm based on mining user tags[J]. Microelectronics & Computer, 2013, 30(2): 74-77. Cited by: 4

Citing documents (2)

Second-level citing documents (3)
