Abstract
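Building on an analysis of the crawling algorithms used by traditional Web crawlers, this paper combines the tunneling algorithm with Web page division to guide a topic-specific crawler. Experiments searching for education resources on the portal sites of four universities show that the new algorithm effectively improves search efficiency.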
Based on an analysis of the search mechanisms of traditional Web crawlers, this paper combines tunneling and Web page division with the crawler's search strategy and proposes a dynamic tunneling search algorithm for Web crawlers. Experiments targeting education resources on four university Websites show that the new algorithm outperforms two standard crawlers for focused crawling.
Source
《现代图书情报技术》
CSSCI
Peking University Core Journal
2008, No. 6, pp. 83-87 (5 pages)
New Technology of Library and Information Service
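Funding
One of the research outcomes of the Hubei Provincial Department of Education teaching research project "Reform and Practice of Multi-level Computer Network Experimental Teaching" (Project No. 20070229)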
Keywords
Web crawlers
Tunneling
Web page division