
Research and Development of Search Engine Technology (搜索引擎技术研究与发展)
Cited by: 53
Abstract: This paper surveys search engine technology. It first classifies search engines by how they operate, then describes the working principles of each component and the related research, covering key technologies such as crawler (robot) strategies, retrieval strategies, search result processing, information retrieval agents, and multimedia search engines. It closes with an outlook on important directions for the future development of search engines.
Source: Computer Engineering (《计算机工程》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, No. 14, pp. 54-56, 104 (4 pages in total)
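Funding: National Natural Science Foundation of China (60205007); Natural Science Foundation of Guangdong Province (001264, 031558); Guangdong Province Science and Technology Program (2003C50118)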
Keywords: search engine; multimedia search engine; information retrieval


