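Abstract
MS-1 is a general-purpose distributed monitoring system designed to monitor and analyze the behavior of parallel and distributed computer systems. It is based on the principle of event-driven monitoring and uses a hybrid hardware/software implementation together with a distributed architecture of networked PCs fitted with expansion boards. This paper mainly describes the implementation principles of the MS-1 hardware. The hardware units comprise one time base generator and one or more monitor data probes. The system uses high-precision phase-locked synchronization to synchronize the clocks of the monitoring nodes, giving a clock resolution of 100 ns and an event capture rate of up to 10 MHz. The system has been used in the hardware and software development and debugging of the prototype of TJ-MPP, a massively parallel processor built by the North China Institute of Computer Technology. More generally, the system is suited to functional and performance debugging, analysis, and evaluation of computer systems that are connected and cooperate in any form.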
This paper introduces the principles and implementation techniques of MS-1, a universal behavior monitoring system for parallel and distributed computer systems. MS-1 is a distributed monitor system based on event-driven monitoring and a hybrid hardware/software implementation. All of its hardware units are designed as extension boards for PCs. Each unit, combined with a PC and supporting software, forms a monitor agent. The monitor agents are connected to each other over a LAN to form a distributed monitor system. The hardware units include a time base generator (TBG) and one or more monitor data probes (MDP). Equipped with a high-speed buffer, each monitor agent has a peak performance of 10 million events per second. The clock signals of the monitor agents are synchronized using a high-accuracy phase-lock technique, which provides a time resolution of 100 ns across the LAN. The monitor system has been used in developing and debugging the prototype of TJ-MPP, a large parallel computer system designed by the North China Institute of Computer Technology. More generally, the monitor system can be used for developing, debugging, and tuning computer systems that are connected or coordinated in any way.
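The two figures quoted above (100 ns time resolution, 10 million events per second peak) can be related with a small sketch. It is illustrative only and is not the MS-1 implementation: the `Event` record, its field names, and the `merge_traces` helper are hypothetical, and the sketch assumes each monitor agent records events stamped in ticks of the shared, synchronized time base.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple

TICK_NS = 100                     # time resolution stated in the abstract
PEAK_EVENTS_PER_S = 10_000_000    # peak event rate stated in the abstract

# Minimum spacing between events at the peak rate, in nanoseconds.
MIN_EVENT_INTERVAL_NS = 1_000_000_000 // PEAK_EVENTS_PER_S


class Event(NamedTuple):
    """Hypothetical trace record; the real MS-1 record layout is not given."""
    tick: int    # timestamp in 100 ns ticks of the shared time base
    agent: str   # assumed monitor-agent identifier
    code: int    # assumed event code


def ticks_to_ns(tick: int) -> int:
    """Convert a tick count to nanoseconds."""
    return tick * TICK_NS


def merge_traces(*traces: Iterable[Event]) -> Iterator[Event]:
    """Merge per-agent traces, each already sorted by tick, into one
    globally ordered stream. This is only meaningful because the agents
    are assumed to share a synchronized clock."""
    return heapq.merge(*traces, key=lambda e: e.tick)


agent_a = [Event(3, "A", 1), Event(10, "A", 2)]
agent_b = [Event(4, "B", 7), Event(9, "B", 8)]
merged = list(merge_traces(agent_a, agent_b))
```

At the peak rate the minimum event spacing equals one clock tick, so under these assumptions two consecutive events captured by one agent can still receive distinct timestamps.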
Source
《计算机学报》
EI
CSCD
PKU Core (Peking University Core Journals)
1998, Issue 4, pp. 296-301 (6 pages)
Chinese Journal of Computers
Funding
National Defense Science and Technology Pre-Research Fund
Keywords
Parallel computer
Distributed computer
Monitoring system
Parallel and distributed computer, event-driven, monitor and performance evaluation, clock synchronization, phase lock technology