Abstract
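To address the shortcomings of traditional web-crawler search strategies that rely on a single value measure, this paper proposes a heuristic web-crawler search algorithm based on adaptive dynamical evolutional particle swarm optimization (ADEPSO). The algorithm combines two link-evaluation methods, immediate value and future value, and dynamically adjusts the relationship between them according to the actual Web search conditions reflected by the link values, so that the crawler can predict page importance more accurately. Experiments show that the algorithm achieves high search efficiency.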
To address the disadvantages of traditional topic crawlers, which use a single-criterion search strategy, a new heuristic search algorithm based on adaptive dynamical evolutionary PSO is proposed. It combines the immediate rewards and future rewards of links to evaluate them jointly. Changes in these rewards are used to estimate how relevant the candidate page set is to the topic, and on that basis the crawler dynamically adjusts the relationship between the two rewards. The experimental results show that this algorithm performs better than traditional algorithms.
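The idea of blending two link scores and shifting their balance as the crawl proceeds can be illustrated with a minimal sketch. This is not the paper's ADEPSO procedure, which the abstract does not spell out. The function names, the linear blend, the fixed step size, and the threshold rule below are all assumptions made for illustration only.

```python
def combined_value(immediate, future, weight):
    """Blend two link scores (hypothetical linear form).

    `weight` in [0, 1] is the share given to the immediate value;
    the remainder goes to the future value.
    """
    return weight * immediate + (1.0 - weight) * future


def adjust_weight(weight, recent_relevance, step=0.1, threshold=0.5):
    """Assumed update rule, not taken from the paper.

    If recently fetched pages look on-topic on average, trust the
    immediate value more; otherwise lean on the future value.
    The result is clamped to [0, 1].
    """
    mean_relevance = sum(recent_relevance) / len(recent_relevance)
    if mean_relevance >= threshold:
        weight += step
    else:
        weight -= step
    return min(1.0, max(0.0, weight))


def rank_links(links, weight):
    """Sort (url, immediate, future) tuples by blended score, best first."""
    return sorted(
        links,
        key=lambda link: combined_value(link[1], link[2], weight),
        reverse=True,
    )
```

In this sketch a crawler would call `adjust_weight` after each batch of fetched pages and then re-rank its frontier with `rank_links`. A particle-swarm-based method would search for the weighting rather than step it by a fixed amount.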
Source
《武汉大学学报(信息科学版)》
EI
CSCD
PKU Core Journal (北大核心)
2008, No. 12, pp. 1296-1299 (4 pages)
Geomatics and Information Science of Wuhan University
Funding
National Natural Science Foundation of China (6047014)
Hubei University of Technology Foundation (200601)
Keywords
topic crawler (网络爬虫)
adaptive dynamical evolutional particle swarm optimization (自适应动态演化粒子群)
immediate rewards (立即价值)
future rewards (未来价值)