Estimating the cycle time of each job over event streams in intelligent manufacturing is critical. These streams include many long-lasting events which have certain durations. The temporal relationships among those in...Estimating the cycle time of each job over event streams in intelligent manufacturing is critical. These streams include many long-lasting events which have certain durations. The temporal relationships among those interval-based events are often complex. Meanwhile, network latencies and machine failures in intelligent manufacturing may cause events to be out-of-order. This topic has rarely been discussed because most existing methods do not consider both interval-based and out-of-order events. In this work, we analyze the preliminaries of event temporal semantics. A tree-plan model of interval-based out-of-order events is proposed. A hybrid solution is correspondingly introduced. Extensive experimental studies demonstrate the efficiency of our approach.展开更多
Fault tolerance in microprocessor systems has become a popular topic of architecture research. Much work has been done at different levels to accomplish reliability against soft errors, and some fault tolerance archit...Fault tolerance in microprocessor systems has become a popular topic of architecture research. Much work has been done at different levels to accomplish reliability against soft errors, and some fault tolerance architectures have been proposed. But little attention is paid to the thread level superscalar fault tolerance. This letter introduces microthread concept into superscalar processor fault tolerance domain, and puts forward a novel fault tolerance architecture, namely, MicroThread Based (MTB) coarse grained transient fault tolerance superscalar processor architecture, then discusses some detailed implementations.展开更多
Pixel-parallel PE and SIMD architectures are widely used in high-speed image processing to enhance computing power. With fully exploiting the data level parallelism of low- and middle-level image processing, SIMD arch...Pixel-parallel PE and SIMD architectures are widely used in high-speed image processing to enhance computing power. With fully exploiting the data level parallelism of low- and middle-level image processing, SIMD architecture is able to finish great amount of computation with much less instruction cycle thus satisfy the high-speed system requirement. The main computation parts in those SIMD image processing hardware is known as PE (processing element) and it is responsible for transferring, storing and processing the image data. This paper describes a high-speed vision system with superscalar PE to enhance system performance and its dedicated parallel computing language specifically devel-oped for this vision system. The vision system can achieve motion detection at more than 2000fps and face detection at more than 100 fps which overwhelms some general serial CPUs in the same applications.展开更多
Under the direction of design space theory,in this paper we discuss the design of a superscalar pipelining using the way of multiple issues,and the implement of a superscalar based RISC DSP architecture,SDSP.Furthermo...Under the direction of design space theory,in this paper we discuss the design of a superscalar pipelining using the way of multiple issues,and the implement of a superscalar based RISC DSP architecture,SDSP.Furthermore,in this paper we discuss the validity of instruction prefetch,the branch prediction,the depth of instruction window and other issues that can affect the performance of superscalar DSP.展开更多
数字信号处理器(Digital Signal Processor,DSP)是一种用于数字信号处理的专用微处理器,在通信、自动化、雷达、航空航天等领域具有重要应用价值.本文系统阐述了DSP体系结构的发展过程和现状,介绍了主要生产厂商的DSP产品及其性能;总结...数字信号处理器(Digital Signal Processor,DSP)是一种用于数字信号处理的专用微处理器,在通信、自动化、雷达、航空航天等领域具有重要应用价值.本文系统阐述了DSP体系结构的发展过程和现状,介绍了主要生产厂商的DSP产品及其性能;总结了DSP芯片的主要结构特点;分析了现有DSP体系结构设计中提升数据级和指令级并行性的主要技术,包括哈佛结构、硬件乘法器、SIMD、VLIW和超标量等.结合新时代DSP应用需求,本文提出了DSP体系结构研究的三个发展方向:(1)通过增加数据和指令并行性,向超高性能DSP发展,提升矢量、标量并行能力,支持张量计算,集成面向神经网络算子的专用控制通路和功能单元,提升AI计算处理能力;(2)从指令系统入手,将变长指令集与超标量技术结合,在实现指令并行的同时,结合可适应神经网络算法扩展的计算流控制指令,提升AI算法映射能力,同时降低代码密度,减小存储压力和取指带宽,降低成本,提升边缘智能实时处理应用能力;(3)兼容面向稀疏神经网络的压缩和并发访问的分布式存储结构,提升边缘智能片上部署能力和网络层多通道并行计算能力.展开更多
文摘Estimating the cycle time of each job over event streams in intelligent manufacturing is critical. These streams include many long-lasting events which have certain durations. The temporal relationships among those interval-based events are often complex. Meanwhile, network latencies and machine failures in intelligent manufacturing may cause events to be out-of-order. This topic has rarely been discussed because most existing methods do not consider both interval-based and out-of-order events. In this work, we analyze the preliminaries of event temporal semantics. A tree-plan model of interval-based out-of-order events is proposed. A hybrid solution is correspondingly introduced. Extensive experimental studies demonstrate the efficiency of our approach.
文摘Fault tolerance in microprocessor systems has become a popular topic of architecture research. Much work has been done at different levels to accomplish reliability against soft errors, and some fault tolerance architectures have been proposed. But little attention is paid to the thread level superscalar fault tolerance. This letter introduces microthread concept into superscalar processor fault tolerance domain, and puts forward a novel fault tolerance architecture, namely, MicroThread Based (MTB) coarse grained transient fault tolerance superscalar processor architecture, then discusses some detailed implementations.
文摘Pixel-parallel PE and SIMD architectures are widely used in high-speed image processing to enhance computing power. With fully exploiting the data level parallelism of low- and middle-level image processing, SIMD architecture is able to finish great amount of computation with much less instruction cycle thus satisfy the high-speed system requirement. The main computation parts in those SIMD image processing hardware is known as PE (processing element) and it is responsible for transferring, storing and processing the image data. This paper describes a high-speed vision system with superscalar PE to enhance system performance and its dedicated parallel computing language specifically devel-oped for this vision system. The vision system can achieve motion detection at more than 2000fps and face detection at more than 100 fps which overwhelms some general serial CPUs in the same applications.
文摘Under the direction of design space theory,in this paper we discuss the design of a superscalar pipelining using the way of multiple issues,and the implement of a superscalar based RISC DSP architecture,SDSP.Furthermore,in this paper we discuss the validity of instruction prefetch,the branch prediction,the depth of instruction window and other issues that can affect the performance of superscalar DSP.
文摘数字信号处理器(Digital Signal Processor,DSP)是一种用于数字信号处理的专用微处理器,在通信、自动化、雷达、航空航天等领域具有重要应用价值.本文系统阐述了DSP体系结构的发展过程和现状,介绍了主要生产厂商的DSP产品及其性能;总结了DSP芯片的主要结构特点;分析了现有DSP体系结构设计中提升数据级和指令级并行性的主要技术,包括哈佛结构、硬件乘法器、SIMD、VLIW和超标量等.结合新时代DSP应用需求,本文提出了DSP体系结构研究的三个发展方向:(1)通过增加数据和指令并行性,向超高性能DSP发展,提升矢量、标量并行能力,支持张量计算,集成面向神经网络算子的专用控制通路和功能单元,提升AI计算处理能力;(2)从指令系统入手,将变长指令集与超标量技术结合,在实现指令并行的同时,结合可适应神经网络算法扩展的计算流控制指令,提升AI算法映射能力,同时降低代码密度,减小存储压力和取指带宽,降低成本,提升边缘智能实时处理应用能力;(3)兼容面向稀疏神经网络的压缩和并发访问的分布式存储结构,提升边缘智能片上部署能力和网络层多通道并行计算能力.