Abstract
Being able to reproduce an error is the first prerequisite for debugging a program. Parallel program execution, however, is inherently non-deterministic, and on multi-core processors (CMPs) in particular, reproducing a concurrency bug is a major challenge. Existing approaches either record the state of the entire system or require fine-grained instrumentation, which makes them impractical or imposes a large runtime overhead. This paper is the first to propose a lightweight, hardware-assisted record-and-replay approach for user-level parallel programs. The recorded sources of non-determinism include system calls, signals, special instructions, and memory conflicts: software assists in recording the ordering related to signals, system calls, and operating-system scheduling, while hardware records memory access conflicts. A directory-based compression technique reduces the log size during recording, and a replay algorithm based on the recorded log is proposed. An evaluation on a 16-core simulated platform shows that the approach makes debugging user-level parallel programs more convenient and reduces log storage overhead by 81%.
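To illustrate what recording memory conflicts means in general, the following is a minimal software sketch, not the paper's hardware mechanism and without its directory-based compression. It assumes a hypothetical trace format (`Access`) and two hypothetical helpers (`record_conflicts`, `respects`): the recorder logs an ordering edge whenever two different threads access the same address and at least one access is a write, and the checker tests whether a candidate replay schedule preserves every logged edge.

```python
from collections import namedtuple

# Hypothetical trace record: thread id, per-thread sequence number,
# memory address, and whether the access is a write.
Access = namedtuple("Access", ["thread", "seq", "addr", "is_write"])


def _edge(src, dst):
    """Encode 'src must happen before dst' as a flat tuple."""
    return (src.thread, src.seq, dst.thread, dst.seq)


def record_conflicts(trace):
    """Log cross-thread RAW, WAW and WAR orderings seen in a global trace."""
    last_write = {}  # addr -> most recent write
    readers = {}     # addr -> reads since the most recent write
    log = []
    for acc in trace:
        prev_w = last_write.get(acc.addr)
        if acc.is_write:
            # Write-after-read conflicts with readers in other threads.
            for r in readers.get(acc.addr, []):
                if r.thread != acc.thread:
                    log.append(_edge(r, acc))
            # Write-after-write conflict with the previous writer.
            if prev_w is not None and prev_w.thread != acc.thread:
                log.append(_edge(prev_w, acc))
            last_write[acc.addr] = acc
            readers[acc.addr] = []
        else:
            # Read-after-write conflict with the previous writer.
            if prev_w is not None and prev_w.thread != acc.thread:
                log.append(_edge(prev_w, acc))
            readers.setdefault(acc.addr, []).append(acc)
    return log


def respects(log, schedule):
    """Check that a schedule of (thread, seq) steps keeps every logged order.

    Program order within a thread is assumed and not checked here.
    """
    pos = {step: i for i, step in enumerate(schedule)}
    return all(pos[(st, ss)] < pos[(dt, ds)] for st, ss, dt, ds in log)


# Illustrative trace: thread 0 writes x, thread 1 reads then writes x,
# and thread 0 reads an unrelated address y.
trace = [
    Access(0, 0, "x", True),
    Access(1, 0, "x", False),
    Access(1, 1, "x", True),
    Access(0, 1, "y", False),
]
log = record_conflicts(trace)
```

In this trace the second logged edge (thread 0's write before thread 1's write) already follows from the first edge plus thread 1's program order. Redundancy of this kind is what a log compression scheme would aim to remove, although the sketch makes no attempt to do so.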
Source
《小型微型计算机系统》
CSCD
PKU Core (Peking University Core Journals list)
2012, No. 10, pp. 2243-2248 (6 pages)
Journal of Chinese Computer Systems
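Funding
Supported by the National Basic Research Program of China (973 Program) (2011CB302501)
Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA010303)
Supported by the National Natural Science Foundation of China (60925009, 60921002, 61100015, 61070025)
Supported by a Huawei cooperation project (YBCB2011030)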
Keywords
multi-core
parallel program
deterministic replay
memory conflict