
Synchronized Intermittent Clock System Based on Decouple De-skew Phase Locked Loop Used in Low Power Processor Design (基于解耦De-skew PLL的处理器低功耗同步间歇时钟系统设计)

Cited by: 2
Abstract As the integration density, die area, and operating frequency of high-performance processors keep increasing, clock dynamic power rises exponentially, and non-uniform clock distribution makes synchronization across clock domains significantly more expensive. These problems have become a bottleneck for processor energy efficiency. Processor cores typically account for more than 70% of the total power of a multi-core processor, and clock power is the main component of core power. Digital dynamic frequency scaling (DFS) lowers the frequency by triggering a clock interrupt exception and reconfiguring the registers of the phase locked loop (PLL) in the clock generation module, which costs the system a wait of more than a millisecond. Analog adaptive frequency scaling (AFS) adjusts the operating frequency continuously by tracking the supply voltage, without system control, but the frequency overshoot that occurs during tuning tightens the physical timing constraints. At the same time, lower power must not come at the expense of processor performance. Long on-chip clock distribution delays vary with process, voltage, and temperature (PVT) and produce an uncertain clock phase skew; the timing margin that physical design adds to compensate for this skew directly degrades the timing performance of critical paths and therefore processor performance. This paper proposes a new synchronized intermittent clock system based on a decoupled De-skew PLL (De-skew Phase Locked Loop), implements a De-skew PLL that supports stable phase error calibration in a 12 nm CMOS process, and evaluates the timing performance and clock power consumption of the whole system. On the one hand, the remote clock feedback of the De-skew PLL aligns the phases of different clock domains in real time and, by tracking inside the loop, tolerates the PVT variation of the clock distribution delay within the feedback loop. On the other hand, the added decoupling module decouples the clock tree frequency from the PLL loop configuration without loss of lock, so the system can lower the frequency in response to clock intermittent control signals generated inside the processor core (signals that intermittently suppress clock pulses) without frequency overshoot, achieving dynamic clock power control with a sub-nanosecond response time. Taking a 12 nm synchronous cascaded structure as an example, the synchronization skew of each level of clock distribution is less than 10 ps after software phase calibration. The RTL of the 16-core LS3C5000 processor was run on a simulation acceleration platform with the SPEC CPU 2000 test suite to evaluate the effect of the scheme on processor core clock power, and the results were further verified by PTPX post-simulation. The results show average power savings of more than 4.5% for fixed-point programs and more than 20.3% for floating-point programs.
Authors YANG Li-Qiong (杨丽琼); WU Rui-Yang (吴瑞阳); YANG Liang (杨梁); WANG Huan-Dong (王焕东). Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049; Loongson Technology Corporation Limited, Beijing 100095
Source Chinese Journal of Computers (《计算机学报》), 2022, No. 10, pp. 2207-2220 (14 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list
Funding Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Category C), Grant No. XDC05020100.
Keywords multi-core processor; synchronized intermittent clock system; decoupled De-skew phase locked loop; low power design
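
The intermittent clock control described in the abstract lowers the effective clock frequency by suppressing individual clock pulses. The sketch below is only a first-order illustration of why that reduces clock dynamic power, using the standard relation P = C·V²·f. The function names, the pulse mask, and every numeric value (capacitance, supply voltage, base frequency) are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def effective_frequency(base_hz: float, pulse_mask: Sequence[bool]) -> float:
    """Average frequency when only the True cycles of pulse_mask are delivered.

    pulse_mask is a hypothetical repeating gating pattern: True means the
    clock pulse is passed to the clock tree, False means it is suppressed.
    """
    if not pulse_mask:
        raise ValueError("pulse_mask must not be empty")
    return base_hz * sum(pulse_mask) / len(pulse_mask)


def clock_dynamic_power(cap_farads: float, vdd_volts: float, freq_hz: float) -> float:
    """First-order clock dynamic power, P = C * V^2 * f."""
    return cap_farads * vdd_volts ** 2 * freq_hz


# Illustrative values only (assumptions, not figures from the paper).
BASE_HZ = 2.0e9          # nominal clock frequency
CLOCK_CAP_F = 1.0e-9     # lumped switched capacitance of the clock tree
VDD_V = 0.8              # supply voltage
MASK = [True, True, True, False]  # suppress one pulse in every four

f_eff = effective_frequency(BASE_HZ, MASK)
p_full = clock_dynamic_power(CLOCK_CAP_F, VDD_V, BASE_HZ)
p_gated = clock_dynamic_power(CLOCK_CAP_F, VDD_V, f_eff)
saving = 1.0 - p_gated / p_full

print(f"effective frequency: {f_eff / 1e9:.2f} GHz")
print(f"clock dynamic power saving: {saving:.1%}")
```

In this simplified model the saving equals the fraction of suppressed pulses, because voltage and capacitance are unchanged. The savings reported in the abstract come from simulation of real workloads and depend on how often the core asserts the intermittent control signal, so they cannot be derived from this formula alone.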