Study on Subject Information Search System Based on Genetic Algorithm
Abstract: To address the "disorientation" and "overload" that users face with web information resources, this paper analyzes and applies the genetic algorithm to build a subject information search system made up of three parts: a genetic-algorithm-based subject crawler, information processing, and a query service. Experimental results show that the system retrieves web pages that are highly relevant to the subject and improves the accuracy of capturing subject-related pages.
Source: Journal of Modern Information (《现代情报》), 2009, No. 3, pp. 176-178, 181 (4 pages)
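Funding: Beijing Natural Science Foundation project (4062013), "Research on the Application of Genetic Algorithms in Web Page Information Search Technology"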
Keywords: subject; genetic algorithm; crawler; search system
