Abstract
In recent years, power consumption has become one of the key issues in processor design. Conventional approaches to reducing power, such as DVFS (Dynamic Voltage and Frequency Scaling), have now hit the law of diminishing returns. As multi-core and many-core processors become mainstream, on-chip caches account for an increasing share of CPU die area and power. To address this, this paper proposes reducing the dynamic power of caches shared by instructions and data by filtering out unnecessary cache way accesses. The approach comprises three filters: the Invalid Filter, which eliminates accesses to cache ways that contain invalid blocks; the I/D Filter, which eliminates accesses to cache ways whose blocks do not match the access type (instruction or data); and the Tag-2 Filter, which eliminates accesses to cache ways whose blocks mismatch on the lowest 2 bits of the tag. Because these methods reduce the activity inside the cache, dynamic CPU power can be significantly decreased. The paper further proposes combining the three methods, called Invalid+I/D+Tag-2 Filter, in an attempt to achieve better power savings. Analysis and experiments verify the effectiveness and complementarity of the three methods. The experiments also show that, compared with Invalid+I/D Filter, Invalid+I/D+Tag-2 Filter achieves an improvement of 19.6%~47.8% (34.3% on average) on a 64KB 4-way set-associative cache and 19.6%~55.2% (39.2% on average) on a 128KB 8-way set-associative cache; compared with Invalid+Tag-2 Filter, it achieves an improvement of 16.1%~27.7% (16.6% on average) on a 64KB 4-way set-associative cache and 6.9%~44.4% (25.0% on average) on a 128KB 8-way set-associative cache.
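The filtering idea described in the abstract can be illustrated with a small software model. The sketch below is not the paper's hardware design; it only assumes that each cache way exposes a valid bit, an instruction/data bit, and a tag, and that a way needs a full access only if none of the enabled filters rules it out. The names `Way` and `ways_to_probe` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

TAG_LOW_BITS = 2  # "Tag-2": compare the lowest 2 tag bits, as in the abstract


@dataclass
class Way:
    """One way of a cache set (illustrative model, not the paper's design)."""
    valid: bool = False
    is_instr: bool = False  # True: block holds instructions; False: data
    tag: int = 0


def ways_to_probe(cache_set, tag, is_instr,
                  use_invalid=True, use_id=True, use_tag2=True):
    """Return the indices of ways that still need a full tag/data access."""
    mask = (1 << TAG_LOW_BITS) - 1
    probe = []
    for i, way in enumerate(cache_set):
        if use_invalid and not way.valid:
            continue  # Invalid Filter: the way holds no valid block
        if use_id and way.is_instr != is_instr:
            continue  # I/D Filter: block type differs from the access type
        if use_tag2 and (way.tag & mask) != (tag & mask):
            continue  # Tag-2 Filter: low tag bits differ, so no hit is possible
        probe.append(i)
    return probe


# A 4-way set with made-up contents.
example_set = [
    Way(valid=True, is_instr=False, tag=0b101101),  # data, low tag bits 01
    Way(valid=True, is_instr=True, tag=0b011001),   # instruction, low bits 01
    Way(),                                          # invalid
    Way(valid=True, is_instr=False, tag=0b101110),  # data, low tag bits 10
]

# A data access whose tag ends in 01.
unfiltered = ways_to_probe(example_set, 0b101101, False,
                           use_invalid=False, use_id=False, use_tag2=False)
invalid_id = ways_to_probe(example_set, 0b101101, False, use_tag2=False)
invalid_tag2 = ways_to_probe(example_set, 0b101101, False, use_id=False)
combined = ways_to_probe(example_set, 0b101101, False)
```

In this made-up set, the I/D bit and the low tag bits each rule out a different way, so the combined filter leaves fewer ways to access than either pairing alone. This mirrors the complementarity claimed in the abstract, although the actual savings depend on the workload and on the hardware cost of the filters, which this model ignores.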
Source
Chinese Journal of Computers (《计算机学报》)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
2013, Issue 4, pp. 799-808 (10 pages)
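Funding
National Basic Research Program of China (973 Program) (2011CB302501)
National Science Fund for Distinguished Young Scholars (60925009)
Science Fund for Creative Research Groups of the National Natural Science Foundation of China (60921002)
Young Scientists Fund of the National Natural Science Foundation of China (61100013, 61202059)
Beijing Nova Program (2010B058)
Huawei-funded research project (YBCB2011030)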