Abstract
As demand for personalized access to information resources grows, topic-focused web crawlers have emerged in response. This paper defines the topic-focused web crawler and describes its working principle. It then reviews research progress in three areas: crawling strategies, web page crawl prioritization, and the design and implementation of topic-focused web crawler systems. Finally, it summarizes the shortcomings of current research and outlines directions for future work.
Authors
左薇
张熹
董红娟
于梦君
ZUO Wei; ZHANG Xi; DONG Hong-juan; YU Meng-jun (School of Professional and Continuing Education, Yunnan University; School of Information, Yunnan University, Kunming 650000, China)
Source
《软件导刊》
2020, No. 2, pp. 278-281 (4 pages)
Software Guide
Funding
General Project of the School of Professional and Continuing Education, Yunnan University (YK1704ZJ).
Keywords
topic-focused web crawler
topic-focused crawler
search engine