
Design and Implementation of an Operations and Maintenance Management Platform for the "Magic Cube-3" High-Performance Computer (“魔方-3”高性能计算机运维管理平台设计与实现)

Cited by: 1
Abstract: As science and technology advance, high-performance computers have become important research infrastructure that supports development across many industries. Keeping such systems running stably and efficiently is both the aspiration and the responsibility of their system administrators. This paper presents an operations and maintenance management platform developed for the "Magic Cube-3" high-performance computer, covering the platform's architecture, its underlying data collection interfaces and methods, and the functions it implements, including system monitoring, automatic inspection, and data analysis. With this platform, system administrators can see the machine's operating status clearly and intuitively, and can detect and handle faults promptly. By mining the data from multiple angles, they can identify the bottlenecks that limit current operating efficiency, which provides an evidence-based foundation for decisions on later software and hardware optimization and upgrades.
Author: ZHAO Qi-qi (赵奇奇), Shanghai Supercomputer Center, Shanghai 201203, China
Source: Computer Engineering & Science (《计算机工程与科学》), 2020, No. 10, pp. 1807-1814 (8 pages). Indexed in CSCD and the Peking University Core Journals list.
Keywords: high-performance computer; maintenance and management; system monitoring; data analysis
