Abstract
In recent years, the distributed computing sites participating in high energy physics collaborations have grown in number and become more widely dispersed. Physics software needs a stable, reliable computing environment and a uniform system software configuration in order to run. This paper studies a unified deployment and monitoring scheme for distributed sites, covering the design of the distributed architecture, file synchronization for software configuration, and the collection of monitoring data, with the aim of managing and operating all sites in a unified way. The scheme supports uniform configuration and centralized operation of distributed computing sites, reduces management and maintenance costs, and helps ensure that physics jobs run reliably at every site.
Authors
Zheng Wei (郑伟)
Yan Xiao Fei (闫晓飞)
Hu Qing Bao (胡庆宝)
Institute of High Energy Physics, Beijing 100049, China
Source
E-science Technology & Application (《科研信息化技术与应用》)
2018, No. 3, pp. 14-19 (6 pages)
Keywords
distributed site
unified deployment
distributed monitoring