370 articles found
The Design and Implementation of the IEEE 802.11 MAC Based on Soft-Core Processor and RTOS (Cited by: 1)
1
Authors: Xiao Wan'ang, Fang Zhi, Shi Yin. 《Journal of Electronics(China)》, 2007, No. 2, pp. 232-237 (6 pages)
The IEEE 802.11 Medium Access Control (MAC) protocol is mainly implemented on a DSP (Digital Signal Processor)/ARM (Advanced RISC Machine) processor or on a DSP/ARM IP (Intellectual Property) core. This paper presents a method based on the Nios II soft-core processor embedded in Altera's Cyclone FPGA (Field Programmable Gate Array) and the MicroC/OS-II RTOS (Real-Time Operating System). The benefits and drawbacks of the above methods are compared, and then the method presented in this paper is described. The hardware/software partitioning is discussed, the hardware architecture is illustrated, and the MAC software programming is described in detail. The presented method has advantages such as low cost and ease of implementation, and it is well suited to implementing the IEEE 802.11 MAC at the research stage.
Keywords: IEEE 802.11 Medium Access Control (MAC) design and implementation Real-Time Operating System (RTOS) Soft-core processor
Research and Design of Reconfigurable Matrix Multiplication over Finite Field in VLIW Processor
2
Authors: Yang Su, Xiaoyuan Yang, Yuechuan Wei. 《China Communications》, SCIE, CSCD, 2016, No. 10, pp. 222-232 (11 pages)
Matrix multiplication plays a pivotal role in symmetric cipher algorithms, but it is one of the most complex and time-consuming units, and its performance directly affects the efficiency of cipher algorithms. Combining the characteristics of VLIW processors with those of matrix multiplication in symmetric cipher algorithms, this paper extracts the reconfigurable elements, analyzes the principle of matrix multiplication, and designs a reconfigurable matrix multiplication architecture for a VLIW processor. Single instructions are proposed for matrix multiplication between a 4×1 and a 4×4 matrix, or between two 4×4 matrices, over GF(2^8); through instruction extension, the instructions can support operations of larger dimensions. Experiments show that the designed instructions support matrix multiplication of different dimensions and greatly improve processing speed.
Keywords: CRYPTOGRAPHY reconfigurable matrix multiplication research and design dedicated instruction VLIW processor
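As a software reference for the operation this abstract describes, the following minimal sketch multiplies a 4×4 matrix by a 4×1 matrix over GF(2^8). It is not the paper's instruction design. The reduction polynomial (the AES polynomial x^8 + x^4 + x^3 + x + 1) and the AES MixColumns matrix are assumptions chosen for illustration, since the abstract names neither, and the names gf_mul and gf_matmul are hypothetical.

```python
# Assumption: AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).
# The abstract does not say which polynomial the proposed instructions use.
AES_POLY = 0x11B


def gf_mul(a, b, poly=AES_POLY):
    """Multiply two bytes in GF(2^8) by shift-and-add with modular reduction."""
    result = 0
    while b:
        if b & 1:          # add (XOR) the current multiple of a
            result ^= a
        a <<= 1            # multiply a by x
        if a & 0x100:      # degree reached 8: reduce modulo the polynomial
            a ^= poly
        b >>= 1
    return result


def gf_matmul(A, B):
    """Matrix product over GF(2^8); addition in the field is XOR."""
    rows, inner, cols = len(A), len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc ^= gf_mul(A[i][k], B[k][j])
            out[i][j] = acc
    return out


# Illustrative 4x4 matrix: the AES MixColumns matrix.
MIX = [[2, 3, 1, 1],
       [1, 2, 3, 1],
       [1, 1, 2, 3],
       [3, 1, 1, 2]]

column = [[0xDB], [0x13], [0x53], [0x45]]   # a 4x1 state column
product = gf_matmul(MIX, column)
```

A dedicated instruction of the kind the abstract describes would perform all sixteen field multiplications and the XOR reductions of such a product in hardware; the loops here only spell out the arithmetic.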
Architecture-level performance/power tradeoff in network processor design
3
Authors: 陈红松, 季振洲, 胡铭曾. 《Journal of Harbin Institute of Technology(New Series)》, EI, CAS, 2007, No. 1, pp. 45-48 (4 pages)
Network processors are used in core network nodes to process packet streams flexibly. As performance increases, the power consumption of network processors rises quickly, and power and cooling become a bottleneck. Architecture-level power-conscious design must go beyond low-level circuit design, and architectural power and performance tradeoffs should be considered together. Simulation is an efficient way to design a modern network processor before the chip is fabricated. To achieve a tradeoff between performance and power, a processor simulator is used to design the network processor architecture. Using the NetBench and CommBench benchmarks and the SimpleScalar processor simulator, the performance and power of the network processor are quantitatively evaluated. A new performance tradeoff evaluation metric is proposed to analyze the network processor architecture. Based on the configuration of the high-performance Intel IXP2800 network processor, optimized values of instruction fetch width and speed, instruction issue width, and instruction window size are analyzed and selected. Simulation results show that the tradeoff design method makes the use of the network processor more effective. The optimal key parameters of the network processor are important in architecture-level design, and the results are meaningful for next-generation network processor design.
Keywords: network processor design performance/power simulation tradeoff evaluation optimization
Parallel Processing Design for LTE PUSCH Demodulation and Decoding Based on Multi-Core Processor
4
Authors: Zhang Ziran, Li Jun, Li Changxiao (ZTE Corporation, Shenzhen 518057, P.R. China). 《ZTE Communications》, 2009, No. 1, pp. 54-58 (5 pages)
The Long Term Evolution (LTE) system imposes strict requirements on dispatching delay. Moreover, the very high air interface rate of LTE demands strong processing capability from the devices that process the baseband signals. Consequently, a single-core processor cannot meet the requirements of an LTE system. This paper analyzes how to use multi-core processors to achieve parallel processing of uplink demodulation and decoding in LTE systems and designs an approach to parallel processing. The test results show that this approach works well.
Keywords: CORE LTE Parallel Processing design for LTE PUSCH Demodulation and Decoding Based on Multi-Core processor design
Multiple Levels of Abstraction in the Simulation of Microthreaded Many-Core Architectures
5
Author: Irfan Uddin. 《Open Journal of Modelling and Simulation》, 2015, No. 4, pp. 159-190 (32 pages)
Simulators are generally used during the design of computer architectures. Typically, different simulators with different levels of complexity, speed and accuracy are used. However, for early design space exploration, simulators with less complexity, high simulation speed and reasonable accuracy are desired. These simulators should also have a short development time, and design changes should require little implementation effort, so that experiments can be run and the effects of the changes observed. Such simulators are termed high-level simulators in the context of computer architecture. In this paper, we present multiple levels of abstraction in a high-level simulation of a general-purpose many-core system, where the objective of every level is to improve simulation accuracy without significantly affecting complexity and simulation speed.
Keywords: HIGH-LEVEL Simulations MULTIPLE LEVELS of ABSTRACTION design Space Exploration many-core Systems
Experimentation of a 1-pixel bit reconfigurable ternary optical processor (Cited by: 1)
6
Authors: 王宏健, 金翊, 欧阳山, 周裕. 《Journal of Shanghai University(English Edition)》, CAS, 2011, No. 5, pp. 430-436 (7 pages)
This paper presents a detailed experiment on a 1-pixel bit reconfigurable ternary optical processor (TOP). In the experiment, 42 basic operation units (BOUs) and 28 typical logic operators of the TOP are realized. The test cases cover every combination of BOUs and all nine inputs of the ternary processor. Both the experimental process and the analysis of the results are given. The experimental results demonstrate that the theory of reconfiguring a TOP is valid and that the reconfiguration circuitry is effective.
Keywords: ternary optical processor (TOP) decrease-radix design basic operation units (BOUs) RECONFIGURABILITY reconfiguration circuitry
Design of a Low Power DSP with Distributed and Early Clock Gating (Cited by: 1)
7
Authors: 王兵, 王琴, 彭瑞华, 付宇卓. 《Journal of Shanghai Jiaotong University(Science)》, EI, 2007, No. 5, pp. 610-617 (8 pages)
A novel clock structure for a low-power 16-bit very long instruction word (VLIW) digital signal processor (DSP) is proposed. To improve deterministic clock gating and to overcome the drawback of conventional clock gating circuits in high-speed designs, a distributed and early clock gating method was developed for the instruction fetch and decode unit, the pipelined data-path unit, and the super-Harvard memory interface unit. The core was implemented following the Synopsys back-end flow in the TSMC (Taiwan Semiconductor Manufacturing Company) 0.18-μm 1.8-V 1P6M process, with a core size of 2 mm×2 mm. Results show that it can run at 200 MHz with a power performance of around 0.3 mW/MIPS. Meanwhile, on average only 39.7% of the circuit is active simultaneously, compared with its non-gated counterpart.
Keywords: digital signal processor (DSP) deterministic clock gating (DCG) distributed and early clock gating low power design pipeline
Dynamic Power Dissipation Control Method for Real-Time Processors Based on Hardware Multithreading
8
Authors: 罗新强, 齐悦, 王磊, 王沁. 《China Communications》, SCIE, CSCD, 2013, No. 5, pp. 156-166 (11 pages)
To eliminate the energy wasted by traditional static hardware multithreaded processors when a real-time embedded system runs under low workload, the energy efficiency of hardware multithreading is discussed and a novel dynamic multithreaded architecture is proposed. The proposed architecture saves the wasted energy by removing idle threads without modifying the original architecture, and it provides a seamless switching mechanism that protects active threads and avoids pipeline stalls during power mode switching. The synthesis report for a dynamic multithreaded processor implemented in a 45 nm process indicates that the area of the dynamic multithreaded architecture is only 2.27% larger than that of the static one while achieving dynamic power dissipation control, and that it consumes 1.3% more power at the same peak performance.
Keywords: dynamic power dissipation control real-time processor hardware multithread low power design energy efficiency
A Low Power Non-Volatile LR-WPAN Baseband Processor with Wake-Up Identification Receiver
9
Authors: YU Shuangming, FENG Peng, WU Nanjian. 《China Communications》, SCIE, CSCD, 2016, No. 1, pp. 33-46 (14 pages)
The paper proposes a low-power non-volatile baseband processor with a wake-up identification (WUI) receiver for an LR-WPAN transceiver. It consists of a WUI receiver, a main receiver, a transmitter, non-volatile memory (NVM) and a power management module. The main receiver adopts a unified, simplified synchronization method and a channel codec with a proactive Reed-Solomon bypass technique, which increase the robustness and energy efficiency of the receiver. The WUI receiver identifies the communication node and wakes up the transceiver, reducing the transceiver's average power consumption. The embedded NVM can back up and restore the processor's state information, which avoids losing state on power failure and removes the unnecessary power spent on repeated computation when the processor is woken from power-down mode. The baseband processor is designed and verified on an FPGA board. The simulated power consumption of the processor is 5.1 μW for transmitting and 28.2 μW for receiving. The WUI receiver technique reduces the average power consumption of the transceiver remarkably: if the transceiver operates for 30 seconds in every 15 minutes, its average power consumption can be reduced by two orders of magnitude.
Keywords: LR-WPAN wake-up identification receiver synchronization non-volatile memory baseband processor digital integrated circuit low power chip design
Design for Low Power Testing of Computation Modules with Contiguous Subspace in VLSI
10
Authors: Ji-Xue Xiao, Yong-Le Xie, Guang-Ju Chen. 《Journal of Electronic Science and Technology of China》, 2009, No. 4, pp. 326-330 (5 pages)
A pseudo Gray code representation of test patterns based on accumulation generators is presented, and a low-power test scheme is proposed for testing computational function modules with contiguous subspace in very large scale integration (VLSI), especially in digital signal processors (DSPs). If the test patterns that accumulators generate for the modules are encoded in the pseudo Gray code representation, the switching activity of the modules is reduced, which lowers test power consumption. FPGA-based experimental results show that the approach reduces dynamic power consumption by an average of 17.40% for an 8-bit ripple carry adder built from 3-2 counters. A hardware implementation of the low-power test is then explored. Because the adders are reused, no additional XOR logic gates need to be introduced. The design minimizes the additional hardware overhead for test and requires no adjustment of the circuit structure. The low-power test can detect any combinational stuck-at fault within the basic building block without degrading the performance of the original circuit.
Keywords: ADDER design digital signal processors (DSP) low power test
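The switching-activity argument in this abstract can be illustrated with a small sketch. It uses the standard reflected binary Gray code as a stand-in, because the abstract does not define the paper's pseudo Gray code, and it assumes the simplest accumulator, one that increments by 1. The function names are hypothetical.

```python
def to_gray(n):
    """Standard reflected binary Gray code (a stand-in for the paper's
    pseudo Gray code, whose definition the abstract does not give)."""
    return n ^ (n >> 1)


def toggles(patterns):
    """Total number of bit flips between consecutive test patterns."""
    return sum(bin(a ^ b).count("1") for a, b in zip(patterns, patterns[1:]))


# Assumption: an 8-bit accumulator stepping by 1 generates the patterns.
binary_patterns = list(range(256))
gray_patterns = [to_gray(n) for n in binary_patterns]

binary_toggles = toggles(binary_patterns)
gray_toggles = toggles(gray_patterns)
```

Each Gray-coded step flips exactly one input bit, while plain binary counting flips about two bits per step on average. Fewer input transitions mean less switching inside the module under test, which is the effect the abstract relies on. The 17.40% figure in the abstract comes from the paper's own experiment and is not reproduced by this sketch.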
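Design of an ARM-Based Furnace Temperature Control System for Resistance Furnaces (Cited by: 1)
11
Author: 马飞. 《工业加热》, CAS, 2024, No. 4, pp. 6-8, 12 (4 pages)
With science and technology advancing rapidly, voltage, current, and switching quantities are all important controlled parameters in modern industrial production. In metallurgical manufacturing, temperature is a very important physical parameter in component production: the temperature of every kind of heating furnace must be strictly controlled and its changes monitored in real time so that the furnace temperature meets the requirements of the components being manufactured. Resistance furnaces are widely used in metal heat treatment and are important industrial equipment for heating metal for forging and for sintering. Resistance furnace temperature control mostly relies on automated control systems to achieve intelligent management, ensuring uniform furnace and part temperatures and improving the reliability and stability of production. Starting from an analysis of the difficulties of resistance furnace temperature control, and in light of the design principles of resistance furnace temperature control systems, this paper proposes a design for a resistance furnace temperature control system based on an ARM processor, which can improve the precision of furnace temperature control and ensure stable industrial production.
Keywords: resistance furnace; temperature control; ARM processor; system design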
Research on Superscalar Digital Signal Processor
12
Authors: Deng Zhenghong, Zheng Wei, Deng Lei, Hu Zhengguo. 《医学信息(医学与计算机应用)》, 2004, No. 2, pp. 64-67 (4 pages)
Guided by design space theory, this paper discusses the design of a multiple-issue superscalar pipeline and the implementation of SDSP, a superscalar RISC DSP architecture. It also discusses the validity of instruction prefetch, branch prediction, the depth of the instruction window, and other issues that can affect the performance of a superscalar DSP.
Keywords: superscalar digital signal processor; design space theory; pipelining; digital signal
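Design of a Pipelined Processor Based on the Harvard Architecture
13
Authors: 刘玉洁, 宋淼. 《实验室科学》, 2024, No. 5, pp. 23-26, 32 (5 pages)
Given the recent direction of engineering education in computer science, there is an urgent need to strengthen students' ability to design CPUs and related technologies, and a pipelined processor based on the Harvard architecture is proposed and implemented. To address the mismatch in undergraduate practical teaching between the demand for instruction-level pipeline design and the shortage of teaching resources, the processor adopts the Harvard architecture and related techniques. This reduces the difficulty of hazard handling and parallel control and limits the scale of the processor design, which resolves the conflicts encountered in teaching to some extent. The design balances a range of practical constraints, places low demands on the laboratory environment and equipment, and has achieved good results in small-scale teaching trials.
Keywords: practical teaching; instruction-level pipelining; Harvard architecture; processor design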
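A Survey of Machine Learning-Assisted Microarchitecture Power Modeling and Design Space Exploration (Cited by: 1)
14
Authors: 翟建旺, 凌梓超, 白晨, 赵康, 余备. 《计算机研究与发展》, EI, CSCD, 北大核心, 2024, No. 6, pp. 1351-1369 (19 pages)
Microarchitecture design is a key stage of processor development. It sits upstream in the overall design flow and directly affects core design metrics such as performance, power, and cost. Over the past several decades, new microarchitecture designs, combined with advances in semiconductor manufacturing processes, have allowed each new generation of processors to achieve higher performance at lower power and cost. However, as integrated circuits enter the "post-Moore era", the dividends of semiconductor process evolution are increasingly limited, and power has become the main challenge in designing energy-efficient processors. At the same time, modern processor architectures are growing more complex and their design spaces larger, and designers want fast, accurate tradeoffs among metrics to obtain better microarchitecture designs. In addition, the existing layer-by-layer design flow is extremely long and time-consuming and can hardly achieve globally optimal energy efficiency. How to perform accurate and efficient forward-looking power estimation and exploration-based optimization at the microarchitecture design stage has therefore become a key problem. To meet these challenges, machine learning techniques have been introduced into the microarchitecture design flow, providing high-quality solutions for processor microarchitecture modeling and optimization. This survey first introduces the main processor design flow, microarchitecture design, and the challenges it faces. It then describes machine learning-assisted integrated circuit design, focusing on research progress in using machine learning to assist microarchitecture power modeling and design space exploration. Finally, it gives a summary and outlook.
Keywords: processor design automation; microarchitecture design; power modeling; design space exploration; machine learning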
Cooperative Computing Techniques for a Deeply Fused and Heterogeneous Many-Core Processor Architecture (Cited by: 13)
15
Authors: 郑方, 李宏亮, 吕晖, 过锋, 许晓红, 谢向辉. 《Journal of Computer Science & Technology》, SCIE, EI, CSCD, 2015, No. 1, pp. 145-162 (18 pages)
Due to advances in semiconductor techniques, many-core processors have been widely used in high performance computing. However, many applications still cannot be carried out efficiently due to the memory wall, which has become a bottleneck in many-core processors. In this paper, we present a novel heterogeneous many-core processor architecture named deeply fused many-core (DFMC) for high performance computing systems. DFMC integrates management processing elements (MPEs) and computing processing elements (CPEs), which are heterogeneous processor cores for different application features with a unified ISA (instruction set architecture), a unified execution model, and shared memory that supports cache coherence. The DFMC processor can alleviate the memory wall problem by combining a series of cooperative computing techniques of CPEs, such as multi-pattern data stream transfer, an efficient register-level communication mechanism, and a fast hardware synchronization technique. These techniques improve on-chip data reuse and optimize memory access performance. This paper illustrates an implementation of a full system prototype based on FPGA with four MPEs and 256 CPEs. Our experimental results show that the effect of the cooperative computing techniques of CPEs is significant, with DGEMM (double-precision matrix multiplication) achieving an efficiency of 94%, FFT (fast Fourier transform) obtaining a performance of 207 GFLOPS and FDTD (finite-difference time-domain) obtaining a performance of 27 GFLOPS.
Keywords: heterogeneous many-core processor data stream transfer register-level communication mechanism hardware synchronization technique processor prototype
Fault Tolerance Mechanism in Chip Many-Core Processors (Cited by: 1)
16
Authors: 张磊, 韩银和, 李华伟, 李晓维. 《Tsinghua Science and Technology》, SCIE, EI, CAS, 2007, No. S1, pp. 169-174 (6 pages)
As semiconductor technology advances, there will be billions of transistors on a single chip. Chip many-core processors are emerging to exploit these greater transistor densities and deliver greater performance. Effective fault tolerance techniques are essential to improve the yield of such complex chips. In this paper, a core-level redundancy scheme called N+M is proposed to improve the yield of N-core processors by providing M spare cores. In such an architecture, topology is an important factor because it greatly affects processor performance. The concept of logical topology and a topology reconfiguration problem are introduced; reconfiguration transparently provides the target topology with the lowest performance degradation in the presence of faulty on-chip cores. A row rippling and column stealing (RRCS) algorithm is also proposed. Results show that RRCS finds solutions with an average degradation of 13.8% in negligible computing time.
Keywords: chip many-core processors YIELD fault tolerance RECONFIGURATION NETWORK-ON-CHIP
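The yield benefit of N+M redundancy can be sketched with a simple binomial model. This is an illustration only, not the paper's yield analysis: it assumes that each core is defect-free independently with the same probability, and that the chip is usable whenever at least N of its N+M cores work. The function name chip_yield and the numbers are hypothetical.

```python
from math import comb


def chip_yield(n_required, m_spare, core_yield):
    """Probability that at least n_required of the n_required + m_spare
    cores are fault-free, assuming independent, identical core yields."""
    total = n_required + m_spare
    return sum(
        comb(total, good)
        * core_yield ** good
        * (1 - core_yield) ** (total - good)
        for good in range(n_required, total + 1)
    )


# Illustrative numbers only: a 16-core chip with 0, 1 and 2 spare cores,
# assuming each core is fault-free with probability 0.95.
no_spares = chip_yield(16, 0, 0.95)
one_spare = chip_yield(16, 1, 0.95)
two_spares = chip_yield(16, 2, 0.95)
```

The model ignores the topology question that the abstract focuses on: after faulty cores are replaced, the surviving cores must still be mapped onto the target logical topology, which is the job of the RRCS algorithm.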
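Integrated Circuit Design and Optimization of High-Performance Digital Signal Processors
17
Authors: 周凯迪, 王翠萍, 李迎侠, 贾玉凤. 《集成电路应用》, 2024, No. 6, pp. 48-49 (2 pages)
This paper describes the integrated circuit design and optimization of high-performance digital signal processors (DSPs). It analyzes existing design flows, core design and optimization techniques, and front-end and back-end circuit design and layout optimization methods, with the aim of improving DSP performance and efficiency and reducing power consumption and cost.
Keywords: integrated circuit design; digital signal processor; performance optimization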
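A Low-Power Design System for Multi-Core Processor Electric Power Chips Based on High-Density Computing
18
Authors: 匡晓云, 黄开天, 杨祎巍. 《电子设计工程》, 2024, No. 7, pp. 6-9, 15 (5 pages)
Multi-core processor electric power chips are an important component of many current systems, and designing low-power electric power chips helps ensure that those systems operate normally. Existing low-power systems for electric power chips run slowly, and their power consumption struggles to meet user requirements. This paper therefore applies high-density computing to design a low-power system for multi-core processor electric power chips. The system's multi-core processor is made compatible with a hierarchical AHB bus, the overall structure of the processor's electric power chip is explored, stored data are processed centrally, and the system's algorithm parameters are adjusted continually. Through high-density analysis, matrices are introduced for data parsing to keep operation safe. Based on an analysis of processor scheduling performance, high-density processing is used to process data hierarchically, avoiding system failures caused by data redundancy. Experimental results show that with the designed system the power consumption of the electric power chip falls by 60% and the speedup reaches 3.992, which effectively improves the chip's operating performance.
Keywords: high-density computing; multi-core processor; electric power chip; low-power design; stored data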
Parallelization and sustainability of distributed genetic algorithms on many-core processors
19
Authors: Yuji Sato, Mikiko Sato. 《International Journal of Intelligent Computing and Cybernetics》, EI, 2014, No. 1, pp. 2-23 (22 pages)
Purpose: The purpose of this paper is to propose a fault-tolerant technology for increasing the durability of application programs when evolutionary computation is performed by fast parallel processing on many-core processors such as graphics processing units (GPUs) and multi-core processors (MCPs). Design/methodology/approach: For distributed genetic algorithm (GA) models, the paper proposes a method in which an island's ID number is added to the header of the data transferred by that island, for use in fault detection. Findings: The paper shows that the processing time of the proposed method is practically negligible in applications, that an optimal solution can be obtained even with a single stuck-at fault or a transient fault, and that increasing the number of parallel threads makes the system less susceptible to faults. Originality/value: The study described in this paper is a new approach to increasing the sustainability of application programs that use distributed GAs on GPUs and MCPs.
Keywords: Evolutionary computation Genetic algorithms Fault identification many-core processors PARALLELIZATION
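The header-based check described in this abstract can be sketched as follows. This is only a guess at the general shape of the idea, written for a plain Python process rather than a GPU. The packet layout, the names MigrationPacket, send and receive, and the policy of discarding a mismatched packet are all assumptions; the abstract does not give the actual data format or recovery procedure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MigrationPacket:
    island_id: int                 # header: ID of the sending island
    individuals: List[List[int]]   # payload: migrating individuals (bit strings)


def send(island_id: int, individuals: List[List[int]]) -> MigrationPacket:
    """Tag migrating individuals with the sender's island ID."""
    return MigrationPacket(island_id, individuals)


def receive(packet: MigrationPacket,
            expected_sender: int) -> Optional[List[List[int]]]:
    """Accept the payload only if the header names the expected sender.

    A mismatch is treated as a detected fault and the packet is discarded
    (assumed policy; the receiving island keeps evolving its own population).
    """
    if packet.island_id != expected_sender:
        return None
    return packet.individuals


healthy = send(3, [[1, 0, 1, 1]])
faulty = MigrationPacket(island_id=7, individuals=[[0, 0, 0, 0]])  # corrupted header

accepted = receive(healthy, expected_sender=3)
rejected = receive(faulty, expected_sender=3)
```

Because the check is a single integer comparison per transfer, its cost is small relative to fitness evaluation, which is consistent with the abstract's finding that the added processing time is practically negligible.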
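A Wireless Video Doorbell System Based on a Wireless Multimedia Processor
20
Authors: 俞怡灵, 俞建军. 《信息化研究》, 2024, No. 5, pp. 71-78 (8 pages)
To improve the system performance of digital wireless video doorbells and reduce their cost, this paper uses a single wireless multimedia embedded microprocessor, the SN93331 or SN93330, to design the indoor and outdoor units of a wireless video doorbell. Each unit needs only one processor, which is the core of the control system, to perform audio and video capture, digital processing, 2.4 GHz wireless adaptive frequency-hopping control, and information control of the embedded system. The hardware also includes the corresponding low-power design circuits, a wireless RF power amplifier circuit, and an analog touch-key input circuit. The software is developed in embedded C for microcontrollers and includes the LCD display routines, the MPEG-4 encoding and decoding routines, and the wireless transmission protocol. The resulting digital wireless video doorbell system is simple in design, reliable, low in cost yet high in quality, and widely applicable.
Keywords: wireless video doorbell; wireless multimedia processor; embedded design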