Power optimization and VLSI design of CPU based on adaptive clock-gating

Cited by: 3

Abstract: A power optimization method for CPUs based on adaptive clock gating is proposed. It reduces the dynamic power wasted by pipeline stalls and by idle periods of the floating-point unit (FPU) and the multimedia co-processor. First, a module-level adaptive clock-gating cell is designed; on-chip hardware automatically detects whether each of these modules is idle, and when a module is idle its clock is turned off, eliminating the dynamic power that unneeded clock toggling would consume inside the module. Then, the adaptive clock-gating cell is applied to the domestically developed processor Unicore-2 to optimize the power consumed during pipeline stalls and during FPU and multimedia co-processor idle periods. Finally, based on the netlist and parasitic parameter files of a taped-out chip in the TSMC 65 nm process, the chip's waveforms are back-annotated to obtain the circuit toggle rates, and power is simulated with the PrimeTime PX tool. The simulation results show that the method yields a power reduction of 18% to 28% when running three typical benchmarks, Dhrystone, Whetstone and Stream, with negligible area overhead and no impact on CPU performance.
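The gating idea described in the abstract can be illustrated with a toy cycle-level model. This is only a sketch under stated assumptions, not the authors' circuit or their PrimeTime PX flow: it counts how many clock cycles reach a module when an idle-detect signal gates its clock. The function names `clocked_cycles` and `clock_toggle_saving`, the idle trace, and the assumption that gating takes effect in the same cycle as idle detection are all hypothetical.

```python
from typing import Sequence


def clocked_cycles(idle: Sequence[bool], gating: bool) -> int:
    """Count cycles in which the module's clock actually toggles.

    idle[i] is True when a (hypothetical) idle detector reports that the
    module has no work in cycle i. With gating enabled the clock is
    suppressed in those cycles; without gating it toggles every cycle.
    """
    if not gating:
        return len(idle)
    return sum(1 for is_idle in idle if not is_idle)


def clock_toggle_saving(idle: Sequence[bool]) -> float:
    """Fraction of the module's clock toggles removed by gating (0.0 to 1.0)."""
    total = clocked_cycles(idle, gating=False)
    if total == 0:
        return 0.0
    gated = clocked_cycles(idle, gating=True)
    return (total - gated) / total


# Made-up trace (assumption): an FPU that is busy in 3 of 10 cycles.
fpu_idle = [True, True, False, False, False, True, True, True, True, True]
saving = clock_toggle_saving(fpu_idle)
print(f"clock toggles removed in this module: {saving:.0%}")
```

The fraction computed here covers only one module's clock toggles in an invented trace, so it is not comparable to the 18% to 28% chip-level power reduction reported in the abstract, which comes from back-annotated simulation of the real netlist.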

Authors: Bu Aiguo (卜爱国), Yu Pianpian (余翩翩), Wu Jianbing (吴建兵), Shan Weiwei (单伟伟)

Affiliation: National ASIC System Engineering Research Center, Southeast University

Source: Journal of Southeast University (Natural Science Edition), 2015, No. 2, pp. 219-223 (5 pages). Indexed in: EI, CAS, CSCD, PKU Core Journals.

Funding: Jiangsu Province "Qing Lan Project"

Keywords: low power; adaptive clock-gating; pipeline stall

Classification: TN47 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]
