Slack-Decode Simultaneously and Redundantly Threaded (SD-SRT) is proposed for detecting transient faults in processors. SD-SRT boosts the previously proposed SRT performance via definitely eliminating redundant inst...Slack-Decode Simultaneously and Redundantly Threaded (SD-SRT) is proposed for detecting transient faults in processors. SD-SRT boosts the previously proposed SRT performance via definitely eliminating redundant instructiou fetches. First, the fetch stage is moved out of the Spheres of Replication (SoR), and a unified instruction-fetch-queue (IFQ) is exploited by both the leading and trailing threads. Second, a scheme called slack-decode cooperates with the unified IFQ to harmonize proceeding of the two threads. The simulations show that SD-SRT outperforms original SRT in terms of IPC by 15%, and decreases I-cache access by 42%. Meanwhile, SD-SRT leads to a lessened size and complexity for hardware structures such as load-value-queue and store-buffer.展开更多
In order to eliminate the energy waste caused by the traditional static hardware multithreaded processor used in real-time embedded system working in the low workload situation, the energy efficiency of the hardware m...In order to eliminate the energy waste caused by the traditional static hardware multithreaded processor used in real-time embedded system working in the low workload situation, the energy efficiency of the hardware multithread is discussed and a novel dynamic multithreaded architecture is proposed. The proposed architecture saves the energy wasted by removing idle threads without manipulation on the original architecture, fulfills a seamless switching mechanism which protects active threads and avoids pipeline stall during power mode switching. The report of an implemented dynamic multithreaded processor with 45 nm process from synthesis tool indicates that the area of dynamic multithreaded architecture is only 2.27% higher than the static one in achieving dynamic power dissipation, and consumes 1.3% more power in the same peak performance.展开更多
文摘Slack-Decode Simultaneously and Redundantly Threaded (SD-SRT) is proposed for detecting transient faults in processors. SD-SRT boosts the previously proposed SRT performance via definitely eliminating redundant instructiou fetches. First, the fetch stage is moved out of the Spheres of Replication (SoR), and a unified instruction-fetch-queue (IFQ) is exploited by both the leading and trailing threads. Second, a scheme called slack-decode cooperates with the unified IFQ to harmonize proceeding of the two threads. The simulations show that SD-SRT outperforms original SRT in terms of IPC by 15%, and decreases I-cache access by 42%. Meanwhile, SD-SRT leads to a lessened size and complexity for hardware structures such as load-value-queue and store-buffer.
基金supported partially by the National High Technical Research and Development Program of China (863 Program) under Grants No. 2011AA040101, No. 2008AA01Z134the National Natural Science Foundation of China under Grants No. 61003251, No. 61172049, No. 61173150+2 种基金the Doctoral Fund of Ministry of Education of China under Grant No. 20100006110015Beijing Municipal Natural Science Foundation under Grant No. Z111100054011078the 2012 Ladder Plan Project of Beijing Key Laboratory of Knowledge Engineering for Materials Science under Grant No. Z121101002812005
文摘In order to eliminate the energy waste caused by the traditional static hardware multithreaded processor used in real-time embedded system working in the low workload situation, the energy efficiency of the hardware multithread is discussed and a novel dynamic multithreaded architecture is proposed. The proposed architecture saves the energy wasted by removing idle threads without manipulation on the original architecture, fulfills a seamless switching mechanism which protects active threads and avoids pipeline stall during power mode switching. The report of an implemented dynamic multithreaded processor with 45 nm process from synthesis tool indicates that the area of dynamic multithreaded architecture is only 2.27% higher than the static one in achieving dynamic power dissipation, and consumes 1.3% more power in the same peak performance.