Microprocessor development emphasizes hardware and software co design. Hw/Sw co design is a modern technique aimed at shortening the time to market in designing the real time and embedded systems. Key feature of this ...Microprocessor development emphasizes hardware and software co design. Hw/Sw co design is a modern technique aimed at shortening the time to market in designing the real time and embedded systems. Key feature of this approach is simultaneous development of the program tools and the target processor to match software application. An effective co design flow must therefore support automatic software toolkits generation, without loss of optimizing efficiency. This has resulted in a paradigm shift towards a language based design methodology for microprocessor optimization and exploration. This paper proposes a formal grammar, UNI SPEC, which supports the automatic generation of assemblers, to describe the translation rules from assembly to binary. Based on UNI SPEC, it implements two typical applications, i.e., automatically generating the assembler and the test suites.展开更多
The relativity of instructions of motor control digital signal processor (MCDSP) in the design is analyzed. A method for obtaining a minimum instruction set in plac e of the complete instruction set during generatio...The relativity of instructions of motor control digital signal processor (MCDSP) in the design is analyzed. A method for obtaining a minimum instruction set in plac e of the complete instruction set during generation of testing procedures is giv en in terms of the processor presentation matrix between micro-operators and in structions of MCDSP.展开更多
In order to gain the great performance of ASIP, this paper discusses different aspects of an ASIP instruction set specification like syntax, encoding, constraints as welt as behaviors, and introduces our ADL model bas...In order to gain the great performance of ASIP, this paper discusses different aspects of an ASIP instruction set specification like syntax, encoding, constraints as welt as behaviors, and introduces our ADL model based methodology to check them. The automatic generation of test cases based on our straight-forward instruction representation is shown, and the efficient generation of them with good coverage is shown as well. The verification of the constraint checker, a very important tool for programmer, is performed. Results show that the toolkit can find some errors in previous delivery tools, and the introduced methodology verifies the feasibility of our instruction set specification.展开更多
A new efficient adapting virtual intermediate instruction set,V-IIS,is designed and implemented towards the optimized dynamic binary translator (DBT) system.With the help of this powerful but previously little-studied...A new efficient adapting virtual intermediate instruction set,V-IIS,is designed and implemented towards the optimized dynamic binary translator (DBT) system.With the help of this powerful but previously little-studied component,DBTs can not only get rid of the dependence of machine(s),but also get better performance.From our systematical study and evaluation,experimental results demonstrate that if V-IIS is well designed,without affecting the other optimizing measures,this could make DBT's performance close to those who do not have intermediate instructions.This study is an important step towards the grand goal of high performance "multi-source" and "multi-target" dynamic binary translation.展开更多
面向RISC-V处理器五级流水线数据通路,设计了基于FPGA的RISC-V指令集子集RV32I的指令译码电路。电路分为主译码电路和程序计数器输入选择(PCSel)译码电路,使用Verilog HDL编程设计,并进行了系列优化:使用时序约束工具分析时序状态,设定...面向RISC-V处理器五级流水线数据通路,设计了基于FPGA的RISC-V指令集子集RV32I的指令译码电路。电路分为主译码电路和程序计数器输入选择(PCSel)译码电路,使用Verilog HDL编程设计,并进行了系列优化:使用时序约束工具分析时序状态,设定约束后对电路进行综合,降低电路延迟;利用无关项化简组合逻辑,减少模块输入输出项,减少电路级联;构建独立的32位串并行数值比较器;插入流水线,提高电路工作频率。电路基于FPGA芯片CycloneⅣEP4CE6F17C6进行设计,使用Quartus Prime 17.1对电路进行仿真,仿真结果表明:在Slow 1200 m V 85℃条件下,指令译码电路达到295.6 MHz的工作频率,相比同类设计具有高速和低资源消耗的特点。展开更多
Computer vision(CV)algorithms have been extensively used for a myriad of applications nowadays.As the multimedia data are generally well-formatted and regular,it is beneficial to leverage the massive parallel processi...Computer vision(CV)algorithms have been extensively used for a myriad of applications nowadays.As the multimedia data are generally well-formatted and regular,it is beneficial to leverage the massive parallel processing power of the underlying platform to improve the performances of CV algorithms.Single Instruction Multiple Data(SIMD)instructions,capable of conducting the same operation on multiple data items in a single instruction,are extensively employed to improve the efficiency of CV algorithms.In this paper,we evaluate the power and effectiveness of RISC-V vector extension(RV-V)on typical CV algorithms,such as Gray Scale,Mean Filter,and Edge Detection.By our examinations,we show that compared with the baseline OpenCV implementation using scalar instructions,the equivalent implementations using the RV-V(version 0.8)can reduce the instruction count of the same CV algorithm up to 24x,when processing the same input images.Whereas,the actual performances improvement measured by the cycle counts is highly related with the specific implementation of the underlying RV-V co-processor.In our evaluation,by using the vector co-processor(with eight execution lanes)of Xuantie C906,vector-version CV algorithms averagely exhibit up to 2.98x performances speedups compared with their scalar counterparts.展开更多
文摘Microprocessor development emphasizes hardware and software co design. Hw/Sw co design is a modern technique aimed at shortening the time to market in designing the real time and embedded systems. Key feature of this approach is simultaneous development of the program tools and the target processor to match software application. An effective co design flow must therefore support automatic software toolkits generation, without loss of optimizing efficiency. This has resulted in a paradigm shift towards a language based design methodology for microprocessor optimization and exploration. This paper proposes a formal grammar, UNI SPEC, which supports the automatic generation of assemblers, to describe the translation rules from assembly to binary. Based on UNI SPEC, it implements two typical applications, i.e., automatically generating the assembler and the test suites.
文摘The relativity of instructions of motor control digital signal processor (MCDSP) in the design is analyzed. A method for obtaining a minimum instruction set in plac e of the complete instruction set during generation of testing procedures is giv en in terms of the processor presentation matrix between micro-operators and in structions of MCDSP.
文摘In order to gain the great performance of ASIP, this paper discusses different aspects of an ASIP instruction set specification like syntax, encoding, constraints as welt as behaviors, and introduces our ADL model based methodology to check them. The automatic generation of test cases based on our straight-forward instruction representation is shown, and the efficient generation of them with good coverage is shown as well. The verification of the constraint checker, a very important tool for programmer, is performed. Results show that the toolkit can find some errors in previous delivery tools, and the introduced methodology verifies the feasibility of our instruction set specification.
基金Projects(12R21414600)supported by Shanghai Municipal Science and Technology Commission,China
文摘A new efficient adapting virtual intermediate instruction set,V-IIS,is designed and implemented towards the optimized dynamic binary translator (DBT) system.With the help of this powerful but previously little-studied component,DBTs can not only get rid of the dependence of machine(s),but also get better performance.From our systematical study and evaluation,experimental results demonstrate that if V-IIS is well designed,without affecting the other optimizing measures,this could make DBT's performance close to those who do not have intermediate instructions.This study is an important step towards the grand goal of high performance "multi-source" and "multi-target" dynamic binary translation.
文摘面向RISC-V处理器五级流水线数据通路,设计了基于FPGA的RISC-V指令集子集RV32I的指令译码电路。电路分为主译码电路和程序计数器输入选择(PCSel)译码电路,使用Verilog HDL编程设计,并进行了系列优化:使用时序约束工具分析时序状态,设定约束后对电路进行综合,降低电路延迟;利用无关项化简组合逻辑,减少模块输入输出项,减少电路级联;构建独立的32位串并行数值比较器;插入流水线,提高电路工作频率。电路基于FPGA芯片CycloneⅣEP4CE6F17C6进行设计,使用Quartus Prime 17.1对电路进行仿真,仿真结果表明:在Slow 1200 m V 85℃条件下,指令译码电路达到295.6 MHz的工作频率,相比同类设计具有高速和低资源消耗的特点。
基金supported by the National Natural Science Foundation of China under Grant No.61972444。
文摘Computer vision(CV)algorithms have been extensively used for a myriad of applications nowadays.As the multimedia data are generally well-formatted and regular,it is beneficial to leverage the massive parallel processing power of the underlying platform to improve the performances of CV algorithms.Single Instruction Multiple Data(SIMD)instructions,capable of conducting the same operation on multiple data items in a single instruction,are extensively employed to improve the efficiency of CV algorithms.In this paper,we evaluate the power and effectiveness of RISC-V vector extension(RV-V)on typical CV algorithms,such as Gray Scale,Mean Filter,and Edge Detection.By our examinations,we show that compared with the baseline OpenCV implementation using scalar instructions,the equivalent implementations using the RV-V(version 0.8)can reduce the instruction count of the same CV algorithm up to 24x,when processing the same input images.Whereas,the actual performances improvement measured by the cycle counts is highly related with the specific implementation of the underlying RV-V co-processor.In our evaluation,by using the vector co-processor(with eight execution lanes)of Xuantie C906,vector-version CV algorithms averagely exhibit up to 2.98x performances speedups compared with their scalar counterparts.