Abstract
To address the key factors that currently limit crawler efficiency, this paper studies the internal operating mechanism of crawler programs, optimizes the crawler architecture, and improves the related algorithms. Tests of the implemented crawler running in a Linux network environment indicate that the proposed solution and improvements are feasible, and that they raise both page-fetching efficiency and the crawler's overall performance.
Source
《计算机工程》
CAS
CSCD
PKU Core (北大核心)
2010, Issue 1, pp. 280-282 (3 pages)
Computer Engineering