Abstract
China now has more than 30 billion web pages, and the number is growing by over 100% per year. To find needed information quickly, efficiently, and accurately among so many pages, topical crawlers emerged: they selectively download pages on the topics a user needs and so supply large-scale data support. This paper studies the design and implementation of a .NET-based web crawler with customizable keywords (that is, topics) that retrieves information more precisely and effectively. Experiments and practical use show that its precision is much higher than that of an ordinary crawler.
Source
Computer and Modernization (《计算机与现代化》)
2011, No. 7, pp. 52-55 (4 pages)