Abstract
This paper proposes an XINPAI-driven Web data scraping model based on Petri nets for space environment data acquisition, a task characterized by complex data sources, strict real-time and accuracy requirements, and diverse data types. First, the basic elements of Petri nets are introduced as the theoretical foundation, and a modeling method suited to Web data scraping is investigated. Then, to verify the model in a concrete application, the architecture of the Space Environment Data Gather Service System (SEDGSS) is designed, and its data source configuration, task control, and task processing subsystems are implemented. Experimental results show that the model provides an automation mechanism and a backtracking verification mechanism, and that it is easy to configure, reusable, and flexible to extend. The system scrapes 254 complex data source tasks in real time, 24 hours a day and 7 days a week, and currently carries out automated, operational space environment data acquisition in support of China's space environment forecasting.
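The abstract names the basic elements of a Petri net (places, transitions, tokens) as the foundation of the scraping model but does not give the net itself. As a rough illustration only, the sketch below shows how a token could drive a scraping task through successive stages. The place names (`pending`, `fetched`, `parsed`, `stored`), the transition names, and the helper `build_scrape_net` are hypothetical and are not taken from the paper or from SEDGSS; the backtracking verification mechanism is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    name: str
    inputs: dict   # place name -> number of tokens consumed
    outputs: dict  # place name -> number of tokens produced


@dataclass
class PetriNet:
    marking: dict                      # place name -> current token count
    transitions: list = field(default_factory=list)

    def enabled(self, t):
        # A transition is enabled when every input place holds enough tokens.
        return all(self.marking.get(p, 0) >= n for p, n in t.inputs.items())

    def fire(self, t):
        # Firing consumes tokens from input places and produces them in outputs.
        if not self.enabled(t):
            raise ValueError(f"transition {t.name!r} is not enabled")
        for p, n in t.inputs.items():
            self.marking[p] -= n
        for p, n in t.outputs.items():
            self.marking[p] = self.marking.get(p, 0) + n

    def run(self):
        # Repeatedly fire the first enabled transition until none is enabled.
        fired = []
        while True:
            ready = [t for t in self.transitions if self.enabled(t)]
            if not ready:
                return fired
            self.fire(ready[0])
            fired.append(ready[0].name)


def build_scrape_net(n_tasks):
    # Hypothetical three-stage pipeline: each token stands for one scraping task.
    net = PetriNet(marking={"pending": n_tasks, "fetched": 0, "parsed": 0, "stored": 0})
    net.transitions = [
        Transition("fetch", {"pending": 1}, {"fetched": 1}),
        Transition("parse", {"fetched": 1}, {"parsed": 1}),
        Transition("store", {"parsed": 1}, {"stored": 1}),
    ]
    return net


if __name__ == "__main__":
    net = build_scrape_net(2)
    print(net.run())
    print(net.marking)
```

In this sketch each stage can only run when a token is present in its input place, which is the sense in which a task is "driven" by tokens; a real system would attach fetch, parse, and storage actions to the transitions.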
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2016, Issue A01, pp. 252-256 (5 pages)
Funding
Equipment Technology Foundation Project (ZKKZX20141ZL01)
Project of the High-Technology Bureau, Chinese Academy of Sciences (YYYJ-1110-01)