Abstract
The number of processor hardware events that can be recorded simultaneously is limited by the number of hardware performance counters. Mainstream processors support hundreds of low-level hardware events but, because on-chip register resources are limited, provide only a small number (usually 6~12) of hardware performance counters. To ease this conflict, hardware counter multiplexing (MPX) time-shares a small number of counter registers to estimate a large number of hardware events. In practice, however, existing MPX estimation algorithms based on time locality have low accuracy, which has kept MPX from being widely adopted. To improve MPX accuracy, we design a new estimation algorithm. Our work has three parts: 1) using the Kolmogorov-Smirnov normality test, we find that, for the same hardware event and the same code, the data collected in one counter one event (OCOE) mode and in MPX mode follow consistent distributions; 2) based on this distribution consistency, we propose a new estimation algorithm for MPX, outline estimation (OLE); 3) we implement OLE in the open-source MPX library NeoMPX and validate it on mainstream X86 and ARM processors. The results show that, when 16 hardware events are collected simultaneously, OLE improves accuracy over the PAPI default MPX estimation algorithm by about 10.5% on average and by up to 46.6%, and achieves 18.8% and 17.7% higher accuracy, respectively, than the other four state-of-the-art MPX estimation algorithms.
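To make the multiplexing idea concrete, the sketch below simulates round-robin time-sharing of a few counters among more events and extrapolates each event's total by simple time-proportional scaling. It is only an illustration of why time-sharing estimates can be inaccurate. It is not the OLE algorithm, whose details the abstract does not give, and the function name `mpx_estimate`, the slice-based scheduling model, and the sample data are all assumptions made for this example.

```python
from typing import List, Sequence


def mpx_estimate(per_slice_counts: Sequence[Sequence[int]],
                 n_counters: int) -> List[float]:
    """Estimate total event counts under round-robin counter multiplexing.

    per_slice_counts[e][t] is the (hypothetical) true count of event e
    in time slice t. Events are split into groups of n_counters, and
    group g is measured only in slices where t % n_groups == g.
    """
    n_events = len(per_slice_counts)
    n_slices = len(per_slice_counts[0])
    n_groups = -(-n_events // n_counters)  # ceiling division

    estimates = []
    for event, counts in enumerate(per_slice_counts):
        group = event // n_counters
        # Slices in which this event actually occupies a counter.
        active = [t for t in range(n_slices) if t % n_groups == group]
        if not active:
            estimates.append(0.0)
            continue
        observed = sum(counts[t] for t in active)
        # Time-proportional scaling: assume unobserved slices behave
        # like the observed ones.
        estimates.append(observed * n_slices / len(active))
    return estimates


# Four events share two counters over four time slices (made-up data).
true_counts = [
    [10, 10, 10, 10],  # steady event: scaling recovers the true total
    [5, 5, 5, 5],      # steady event
    [0, 100, 0, 0],    # bursty event: the burst falls in a measured slice
    [1, 2, 3, 4],      # trending event
]
estimates = mpx_estimate(true_counts, n_counters=2)
true_totals = [sum(c) for c in true_counts]
for est, total in zip(estimates, true_totals):
    print(f"estimated={est:.1f} true={total}")
```

In this toy run the steady events are recovered exactly, while the bursty event is overestimated by a factor of two, which is the kind of error that motivates better estimation algorithms.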
Authors
Lin Xinhua (林新华)
Wang Jie (王杰)
Wang Yichao (王一超)
Zuo Sicheng (左思成)
High Performance Computing Center, Shanghai Jiao Tong University, Shanghai 200240
Source
《计算机研究与发展》
EI
CSCD
PKU Core Journals (北大核心)
2022, Issue 6, pp. 1192-1201 (10 pages)
Journal of Computer Research and Development
Funding
National Natural Science Foundation of China (62072300).
Keywords
processor hardware performance counters
multiplexing (MPX)
performance profiling
high performance computing
estimation method