Funding: Sponsored by the National Natural Science Foundation of China (No. 60573141, 60773041); the National 863 High Technology Research Program of China (No. 2007AA01Z404, 2007AA01Z478); the High Technology Research Programme of Jiangsu Province (No. BG2006001); the Key Laboratory of Information Technology Processing of Jiangsu Province (kjs06006); and a Project of NJUPT (NY207135).
Abstract: Cache performance tuning tools help developers write programs with good locality and make full use of the cache, reducing the impact of the speed gap between processor and memory. This paper introduces the design and implementation of a cache performance tuning tool named CTuning, which employs a source-level instrumentation method to gather program data access information and uses a limited reuse distance model to analyze cache behavior. Experiments on 183.equake improve average performance by more than 6% and show that CTuning is proficient not only in locating cache performance bottlenecks to guide manual code transformation, but also in analyzing cache behavior relationships among variables, thereby directing manual data reorganization.
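As a rough illustration of the reuse distance idea mentioned in the abstract, the sketch below computes the standard (unbounded) reuse distance of each access in a trace, that is, the number of distinct addresses touched since the previous access to the same address. The abstract does not define CTuning's "limited" variant, so this is not the tool's model; the function names `reuse_distances` and `lru_miss_count` are hypothetical.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of each access (None for a first access)."""
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Count the distinct addresses touched since the last access to addr.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def lru_miss_count(trace, capacity):
    """Misses in a fully associative LRU cache holding `capacity` blocks.

    Assumption for illustration: an access misses when it is a first access
    or when its reuse distance is at least the cache capacity.
    """
    return sum(d is None or d >= capacity for d in reuse_distances(trace))
```

For the trace `a b c a b`, the last two accesses each have reuse distance 2, so they hit in a three-block LRU cache and miss in a two-block one.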
Funding: Partially supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61232008; the NSFC Joint Research Fund for Overseas Chinese Scholars and Scholars in Hong Kong and Macao under Grant No. 61328201; the National Science Foundation of USA under Contract Nos. CNS-1319617, CCF-1116104, and CCF-0963759; an IBM CAS Faculty Fellowship; and a research grant from Huawei.
Abstract: Performance metrics and models are prerequisites for scientific understanding and optimization. This paper introduces a new footprint-based theory and reviews the research of the past four decades leading to the new theory. The review groups the past work into metrics and their models (in particular those of the reuse distance), metrics conversion, models of shared cache, performance and optimization, and other related techniques.
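The abstract does not define the footprint metric. As a minimal sketch, assuming the common definition of the footprint as the average number of distinct data items accessed over all windows of a given length, it can be computed by brute force as follows. The function name `average_footprint` is hypothetical, and this quadratic-time version is for illustration only.

```python
def average_footprint(trace, w):
    """Average number of distinct items over all length-w windows of trace."""
    n = len(trace)
    if not 1 <= w <= n:
        raise ValueError("window length must be between 1 and len(trace)")
    # Enumerate every contiguous window of length w.
    windows = [trace[i:i + w] for i in range(n - w + 1)]
    return sum(len(set(win)) for win in windows) / len(windows)
```

For the trace `a a b b` with window length 2, the windows hold 1, 2, and 1 distinct items, giving an average footprint of 4/3.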