Abstract
Web crawlers are a product of the development of network technology. Following the links between pages according to its own crawl logic, a crawler collects and classifies the data found on web pages and stores it on local storage media. The crawled data can then be used for tasks such as gathering and categorizing information. Crawling is one of the main ways of processing online information today and a core technology of web search. In the early stages of building a web search engine, the engine's designers should optimize the form in which web page information is presented, improve the usability of the pages, and optimize the web crawler to a certain extent.
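The link-following and local-storage behavior described in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the names `LinkParser`, `crawl`, `fetch`, `max_pages`, and the in-memory `SITE` dictionary are all hypothetical, and a plain dictionary stands in for local storage media. The page fetcher is passed in as a function so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkParser(HTMLParser):
    """Collects the href values of <a> tags found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl. `fetch(url)` returns HTML text, or None on failure."""
    store = {}                  # stand-in for local storage: url -> html
    queue = deque([start_url])  # pages waiting to be visited
    seen = {start_url}          # avoids queueing the same URL twice
    while queue and len(store) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        store[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before deduplicating.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return store


# Hypothetical in-memory "site" used in place of real HTTP requests.
SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a>',
    "http://example.test/b": '<a href="/a#top">A again</a>',
}

pages = crawl("http://example.test/", SITE.get)
```

A real crawler would replace `SITE.get` with an HTTP fetcher and would typically also respect robots.txt and rate limits, which this sketch omits.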
Author
HUANG Yanni (黄燕妮), Quanzhou Institute of Textile and Garment, Quanzhou, Fujian 362700
Source
Software (《软件》), 2022, No. 8, pp. 153-155, 166 (4 pages)
Keywords
web crawler
website
optimization strategy