Abstract
As the concept of big data has taken hold, the value of data has become widely recognized, and collecting data from the internet with web crawlers is now a common way to obtain external data. To keep automated bots from accessing their pages, many websites restrict access by deploying CAPTCHAs, and the slider CAPTCHA is one of the many CAPTCHA types in use. Taking the Geetest (极验) slider CAPTCHA as the research object, this paper studies how image recognition can be used to simulate automatic verification of the CAPTCHA during crawling.
Author
黄骁骏
HUANG Xiaojun (Shanghai Instrument Automatic Control System Inspection and Testing Institute Co., Ltd., Shanghai 200233, China)
Source
《电子技术(上海)》
2020, No. 5, pp. 16-19 (4 pages)
Electronic Technology