9 articles found
Discrete Cosine Transform Using IA MMX™ Technology
1
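Authors: Li Weizhao (李维钊), Wang Guangwei (王广伟). 《电子技术应用》, PKU Core, 2000, No. 7, pp. 11-14 (4 pages)
Building on a brief introduction to IA (Intel Architecture) MMX™ technology, this paper focuses on a fast DCT algorithm that uses IA MMX™ technology and on its superior performance.
Keywords: Intel architecture; IA MMX technology; microprocessor; discrete cosine transform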
Technical Characteristics and Implementation of Very Long Instruction Word (VLIW)
2
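Author: Zhao Xin (赵信). 《计算机工程与应用》, CSCD, PKU Core, 1992, No. 1, pp. 18-22 (5 pages)
This paper explains the design techniques of very long instruction word machines, analyzes the characteristics of VLIW in detail using the TRACE machine as an example, and describes the current state and development of VLIW technology.
Keywords: very long instruction word; microprocessor; RISC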
A Study of Register File Structures and Management Mechanisms in RISC Architectures
3
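Author: Cheng Dongnian (程东年). 《计算机工程与应用》, CSCD, PKU Core, 1990, No. 1, pp. 8-16 (9 pages)
Although RISC technology has made it possible and practical to integrate a large number of registers on a microprocessor chip, whether such a large register file can be used effectively directly affects how far the advantages of RISC are realized. This paper discusses various CPU register file structures and several implementation schemes. With the goals of using registers efficiently and reducing the volume of data transferred between the CPU and main memory, it studies register management strategies for improving the execution efficiency of call/return, and briefly compares the effects of these strategies on several typical machine performance parameters.
Keywords: RISC; architecture; microprocessor; register file; management mechanism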
AMD's 64-Bit Gamble: Thoughts on x86-64
4
Author: CHO. 《电脑新时代》, 2000, No. 9, pp. 13-16 (4 pages)
Keywords: AMD; x86-64; microprocessor; instruction set
The Instruction Set of the 80386
5
Author: Chen Yousong (陈幼松). 《中国计算机用户》, 1990, No. 1, pp. 42-43 (2 pages)
Keywords: instruction set; computer; Intel 80386
Instruction Set Architecture of Microprocessors
6
Author: Chen Yousong (陈幼松). 《中国计算机用户》, 1990, No. 1, pp. 32-35 (4 pages)
Keywords: microprocessor; instruction set; structure
Japan's TRON Project and the Characteristics of TRON CPU Instructions
7
Author: Zhang Mou (张谋). 《计算机世界月刊》, 1991, No. 1, pp. 7-10 (4 pages)
Keywords: TRON project; TRON-CPU; instruction; microcomputer
Register Reallocation for Soft Error Reduction (cited by: 1)
8
Authors: WEN Peng, YAN Guochang, (+1 more author), LI Xuhui, YING Shi. 《Wuhan University Journal of Natural Sciences》, CAS, 2014, No. 6, pp. 519-525 (7 pages)
Subsequently to the problem of performance and energy overhead, the reliability problem of the system caused by soft error has become a growing concern. Since register file (RF) is the hottest component in processor, if not well protected, soft errors occurring in it will do harm to the system reliability greatly. In order to reduce soft error occurrence rate of register file, this paper presents a method to reallocate the register based on the fact that different live variables have different contribution to the register file vulnerability (RFV). Our experimental results on benchmarks from MiBench suite indicate that our method can significantly enhance the reliability.
Keywords: register allocation; soft error; reliability
Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs
9
Authors: Yun Liang, Shuo Wang. 《Journal of Computer Science & Technology》, SCIE, EI, CSCD, 2016, No. 1, pp. 36-49 (14 pages)
The key to high performance for GPU architecture lies in its massive threading capability to drive a large number of cores and enable execution overlapping among threads. However, in reality, the number of threads that can simultaneously execute is often limited by the size of the register file on GPUs. The traditional SRAM-based register file takes up so large amount of chip area that it cannot scale to meet the increasing demand of GPU applications. Racetrack memory (RM) is a promising technology for designing large capacity register file on GPUs due to its high data storage density. However, without careful deployment of RM-based register file, the lengthy shift operations of RM may hurt the performance. In this paper, we explore RM for designing high-performance register file for GPU architecture. High storage density RM helps to improve the thread level parallelism (TLP), but if the bits of the registers are not aligned to the ports, shift operations are required to move the bits to the access ports before they are accessed, and thus the read/write operations are delayed. We develop an optimization framework for RM-based register file on GPUs, which employs three different optimization techniques at the application, compilation, and architecture level, respectively. More clearly, we optimize the TLP at the application level, design a register mapping algorithm at the compilation level, and design a preshifting mechanism at the architecture level. Collectively, these optimizations help to determine the TLP without causing cache and register file resource contention and reduce the shift operation overhead. Experimental results using a variety of representative workloads demonstrate that our optimization framework achieves up to 29% (21% on average) performance improvement.
Keywords: register file; racetrack memory; GPU