7 articles found
1. Taxonomy of Data Prefetching for Multicore Processors (cited by: 1)
Authors: Surendra Byna, 陈勇 (Yong Chen), 孙贤和 (Xian-He Sun). Journal of Computer Science & Technology (SCIE, EI, CSCD), 2009, No. 3, pp. 405-417 (13 pages)
Data prefetching is an effective data access latency hiding technique to mask the CPU stall caused by cache misses and to bridge the performance gap between processor and memory. With hardware and/or software support, data prefetching brings data closer to a processor before it is actually needed. Many prefetching techniques have been developed for single-core processors. Recent developments in processor technology have brought multicore processors into the mainstream. While some of the single-core prefetching techniques are directly applicable to multicore processors, numerous novel strategies have been proposed in the past few years to take advantage of multiple cores. This paper aims to provide a comprehensive review of the state-of-the-art prefetching techniques, and proposes a taxonomy that classifies various design concerns in developing a prefetching strategy, especially for multicore processors. We compare various existing methods through analysis as well.
Keywords: taxonomy of prefetching strategies; multicore processors; data prefetching; memory hierarchy
2. Energy Efficient Block-Partitioned Multicore Processors for Parallel Applications
Authors: 祁轩 (Xuan Qi), 朱大开 (Dakai Zhu). Journal of Computer Science & Technology (SCIE, EI, CSCD), 2011, No. 3, pp. 418-433 (16 pages)
Due to the increasing power consumption in modern computing systems, energy management has become an important research area in the last decade. Recently, multicore has emerged as an energy-efficient architecture that exploits parallelism in modern applications. However, as the number of cores on a single chip continues to increase, effectively managing the energy efficiency of multicore-based systems has become a grand challenge. In this paper, based on the voltage island and dynamic voltage and frequency scaling (DVFS) techniques, we investigate the energy efficiency of block-partitioned multicore processors, where cores are grouped into blocks with the cores on one block sharing a DVFS-enabled power supply. Depending on the number of cores on each block, we study both symmetric and asymmetric block configurations. We develop a system-level power model (which can support various power management techniques) and derive both block- and system-wide energy-efficient frequencies for systems with block-partitioned multicore processors. Based on the power model, we prove that, for embarrassingly parallel applications, having all cores on a single block can achieve the same energy savings as that of the individual block configuration (where each core forms a single block and has its own power supply). However, for applications with limited degrees of parallelism, we show the superiority of the buddy-asymmetric block configuration, where the number of required blocks (and power supplies) is logarithmically related to the number of cores on the chip, in that it can achieve the same amount of energy savings as that of the individual block configuration. The energy efficiency of different block configurations is further evaluated through extensive simulations with both synthetic applications and a real-life application.
Keywords: multicore processors; dynamic voltage and frequency scaling (DVFS); voltage islands; parallel applications
3. A Hybrid Model for Reliability Aware and Energy-Efficiency in Multicore Systems (cited by: 1)
Authors: Samar Nour, Sameh A. Salem, Shahira M. Habashy. Computers, Materials & Continua (SCIE, EI), 2022, No. 3, pp. 4447-4466 (20 pages)
Recently, multicore systems have used Dynamic Voltage/Frequency Scaling (DV/FS) technology to allow cores to operate at voltages and/or frequencies different from those of other cores, in order to save power and enhance performance. In this paper, an effective and reliable hybrid model to reduce the energy and makespan in multicore systems is proposed. The proposed hybrid model enhances and integrates the greedy approach with dynamic programming to achieve optimal Voltage/Frequency (Vmin/F) levels. Then, the allocation process is applied based on the available workloads. The hybrid model consists of three stages. The first stage gets the optimum safe voltage, the second stage sets the level of energy efficiency, and the third is the allocation stage. Experimental results on various benchmarks show that the proposed model can generate optimal solutions to save energy while minimizing the makespan penalty. Comparisons with other competitive algorithms show that the proposed model provides on average 48% improvements in energy saving and achieves an 18% reduction in computation time while ensuring a high degree of system reliability.
Keywords: energy-efficiency; safe voltage; multicore processors; core utilization; dynamic voltage/frequency scaling; makespan
4. Design for Testability Features of Godson-3 Multicore Microprocessor (cited by: 2)
Authors: 齐子初 (Zi-Chu Qi), 刘慧 (Hui Liu), 李向库 (Xiang-Ku Li), 胡伟武 (Wei-Wu Hu). Journal of Computer Science & Technology (SCIE, EI, CSCD), 2011, No. 2, pp. 302-313 (12 pages)
This paper describes the design for testability (DFT) challenges and techniques of the Godson-3 microprocessor, which is a scalable multicore processor based on the scalable mesh of crossbar (SMOC) on-chip network and targets high-end applications. Advanced techniques are adopted to make the DFT design scalable and achieve low-power and low-cost test with limited IO resources. To achieve a scalable and flexible test access, a highly elaborate test access mechanism (TAM) is implemented to support multiple test instructions and test modes. Taking advantage of the multiple identical cores embedded in the processor, scan partition and on-chip comparisons are employed to reduce test power and test time. A test compression technique is also utilized to decrease test time. To further reduce test power, clock control logic is designed with the ability to turn off the clocks of non-testing partitions. In addition, scan collars of caches are designed to perform functional test with low-speed ATE for speed-binning purposes, which poses low complexity and has good correlation results.
Keywords: DFT (design for testability); TAM (test access mechanism); multicore processor; low power test
5. Using Pipeline Instructions by Parallel Simulation of Mathematical Models
Authors: Peter Kvasnica, Igor Kvasnica. Journal of Mathematics and System Science, 2012, No. 9, pp. 552-557 (6 pages)
Simulation is an important and useful technique helping users understand and model real-life systems. Once built, the models can run providing realistic results. This supports making decisions on a more logical and scientific basis. The paper introduces a method of simulation, and describes various types of its application. The authors used the method of analysis of the creation and implementation of the programme code. The authors compared parallel instruction of computing defined to pipelined instructions. The power of simulation is that a common model can be used to design a large variety of systems. An important aspect of the simulation method is that a simulation model is designed to be repeated in actual computer systems, especially in multicore processors. For this reason, it is important to minimize the average waiting time for fetch and decode stage instructions. The objective of the research is to prove that the parallel operation of programme code is faster than sequential operation of the code on the multiprocessor architecture. The system modeling uses methods and simulation on the parallel computer systems is very precise. The time benefit gained in simulation of a mathematical model on the pipeline processor is higher than the one in simulation of a mathematical model on the multiprocessor computer system.
Keywords: decentralization; mathematical model in state space; simulation; parallel programme code; multicore processors; pipeline instruction processing
6. Wide Operational Range Processor Power Delivery Design for Both Super-Threshold Voltage and Near-Threshold Voltage Computing
Authors: Xin He, Gui-Hai Yan, Yin-He Han, Xiao-Wei Li. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, No. 2, pp. 253-266 (14 pages)
The load power range of modern processors is greatly enlarged because many advanced power management techniques are employed, such as dynamic voltage frequency scaling, Turbo Boosting, and near-threshold voltage (NTV) technologies. However, because the efficiency of power delivery varies greatly with different load conditions, conventional power delivery designs cannot maintain high efficiency over the entire voltage spectrum, and the gained power saving may be offset by power loss in power delivery. We propose SuperRange, a wide operational range power delivery unit. SuperRange complements the power delivery capability of the on-chip voltage regulator and the off-chip voltage regulator. On top of SuperRange, we analyze its power conversion characteristics and propose a voltage regulator (VR) aware power management algorithm. Moreover, as more and more cores have been integrated on a single chip, multiple SuperRange units can serve as basic building blocks to build, in a highly scalable way, a more powerful power delivery subsystem with larger power capacity. Experimental results show that the SuperRange unit offers 1x and 1.3x higher power conversion efficiency (PCE) than two other conventional power delivery schemes in the NTV region and exhibits an average 70% PCE over the entire operational range. It also exhibits superior resilience to power-constrained systems.
Keywords: voltage regulator; power delivery; near-threshold computing; multicore processor
7. FlexCore: Dynamic Virtual Machine Scheduling Using VCPU Ballooning (cited by: 2)
Authors: Tianxiang Miao, Haibo Chen. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2015, No. 1, pp. 7-16 (10 pages)
As multi-core processors become the de facto configuration in modern computers, the adoption of SMP Virtual Machines (VMs) has been increasing, allowing for more efficient use of computing resources. However, because of the existence of schedulers in both the hypervisor and the guest VMs, this creates a new research problem, viz., double scheduling. Although double scheduling may cause many issues including lock-holder preemption, vCPU stacking, CPU fragmentation, and priority inversion, prior approaches have either introduced new problems and/or addressed the problem incompletely. In this paper, we describe the design and implementation of FlexCore, a new scheduling scheme using vCPU ballooning, which dynamically adjusts the number of vCPUs of a VM at runtime. This essentially eliminates unnecessary scheduling in the hypervisor layer, and thus boosts performance significantly. An evaluation using a complete KVM-based implementation shows that the average performance improvement for PARSEC applications on a 12-core Intel machine is approximately 52.9%, ranging from 35.4% to 79.6%.
Keywords: virtualization; SMP virtual machine; multicore processor; vCPU ballooning