Abstract
This paper proposes a crawler system for a Mobile Agent based Image Search Engine (MAISE). The crawlers are implemented as mobile agents that run on remote Web servers, so that computation-intensive tasks traditionally concentrated on the search engine server, such as feature extraction and index building, are distributed to those remote servers and executed in parallel; the number of agents is controllable. Only a small amount of result data is sent back to the search engine server, which both improves processing efficiency and reduces network traffic. A prototype of the MAISE crawler system was built and tested, and the experimental results show that it outperforms traditional crawlers in both network data transfer volume and crawl time.
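The division of work described in the abstract can be illustrated with a minimal sketch. It is not the MAISE implementation: threads stand in for mobile agents (a real agent would migrate its code to the remote host), an intensity histogram stands in for the image features, and every name here (REMOTE_SITES, extract_feature, crawler_agent, crawl, max_agents) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for images hosted on remote Web servers:
# host name -> {image URL -> raw grayscale pixel values (0-255)}.
REMOTE_SITES = {
    "host-a.example": {"a/1.png": [0, 10, 200, 255] * 64, "a/2.png": [128] * 256},
    "host-b.example": {"b/1.png": [30, 60, 90, 120] * 64},
}


def extract_feature(pixels, bins=4):
    """Normalized intensity histogram, used here as a simple image feature."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [count / total for count in hist]


def crawler_agent(host):
    """Simulates an agent working at one host: features are computed
    next to the data, and only compact index entries are returned."""
    return [
        (host, url, extract_feature(pixels))
        for url, pixels in REMOTE_SITES[host].items()
    ]


def crawl(hosts, max_agents=2):
    """Run at most max_agents agents concurrently and merge their
    results into one central index."""
    index = []
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        for entries in pool.map(crawler_agent, hosts):
            index.extend(entries)
    return index


def raw_size(hosts):
    """Number of pixel values a centralized crawler would have to download."""
    return sum(len(p) for h in hosts for p in REMOTE_SITES[h].values())
```

In this toy setup, calling crawl(sorted(REMOTE_SITES)) returns three index entries of four numbers each, whereas a centralized crawler would first have to fetch all 768 pixel values; max_agents plays the role of the controllable agent count.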
Source
《华中科技大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2005, Issue z1, pp. 226-228, 246 (4 pages)
Journal of Huazhong University of Science and Technology (Natural Science Edition)
Funding
Supported by the China Education and Research Grid (ChinaGrid) project (CG2003-GA001)
Keywords
search engine
image search engine
Web crawler
mobile agent
parallel search