Abstract
Crawler frameworks have become an important means of acquiring data efficiently on the internet. Existing frameworks, however, are designed for specific internet users, scenarios, and purposes, so applying them directly to the power industry internet raises problems of unsuitability and insecurity. This paper therefore proposes a crawler framework for the power industry internet based on the idea of generics. Starting from the actual needs of the power industry internet, the framework applies generic design to users, scenarios, module design, and usage, so that it offers the capabilities of existing crawler frameworks while also meeting the data-crawling needs of the power industry internet. Verification tests in the production environment of a group's power industry internet show that the framework meets the usage and security requirements of different users and different scenarios, retains the characteristics of existing crawler frameworks, and that the generic design achieves the expected results.
Authors
毕玉冰
王文庆
刘超飞
崔逸群
董夏昕
金晶
BI Yubing; WANG Wenqing; LIU Chaofei; CUI Yichun; DONG Xiaxin; JIN Jing (Xi’an Thermal Power Research Institute Co., Ltd., Xi’an 710054, China; Beijing China Power Puhua Technology Co., Ltd., Beijing 100000, China)
Source
《热力发电》
CAS
Peking University Core Journal (北大核心)
2020, Issue 11, pp. 20-27 (8 pages)
Thermal Power Generation
Funding
Headquarters Science and Technology Project of China Huaneng Group Co., Ltd. (HNKJ20-H04).