Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2016YFB0200501, the National Natural Science Foundation of China under Grant Nos. 61332009 and 61521092, the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing under Grant No. 2016A04, the Beijing Municipal Science and Technology Commission under Grant No. Z15010101009, the Open Project Program of the State Key Laboratory of Computer Architecture under Grant No. CARCH201503, the China Scholarship Council, and the Beijing Advanced Innovation Center for Imaging Technology.
Abstract: With the coming of the exascale supercomputing era, power efficiency has become the most important obstacle to building an exascale system. Dataflow architectures have a native advantage in achieving high power efficiency for scientific applications. However, state-of-the-art dataflow architectures fail to exploit high parallelism in loop processing. To address this issue, we propose a pipelining loop optimization method (PLO), which makes loop iterations flow through the processing element (PE) array of a dataflow accelerator. The method consists of two techniques: architecture-assisted hardware iteration and instruction-assisted software iteration. In the hardware iteration execution model, an on-chip loop controller generates loop indexes, reducing the complexity of the computing kernel and laying a good foundation for pipelined execution. In the software iteration execution model, additional loop instructions are introduced to solve the iteration dependency problem. With these two techniques, the average number of instructions ready to execute per cycle increases, keeping the floating-point unit busy. Simulation results show that the proposed method outperforms the static and dynamic loop execution models in floating-point efficiency by 2.45x and 1.1x on average, respectively, while the hardware cost of the two techniques is acceptable.
Abstract: A distributed optimal local double loop (DOLDL) network is presented. Emphasis is laid on the topology and the distributed routing algorithms for the DOLDL. Starting from an abstract model, a set of definitions and theorems is stated and proved. An algorithm that optimizes double loop networks is presented. The optimal values of the topological parameters for the DOLDL have been obtained with this algorithm, and the numerical results are analyzed. The study shows that the bounds on the optimal diameter ($d$) and the average hop distance ($a$) for this class of networks are $[\sqrt{3N} - 2] \le d \le [\sqrt{3N} + 1]$ and $\frac{5N}{9(N-1)}(\sqrt{3N} - 1.8) < a < \frac{5N}{9(N-1)}(\sqrt{3N} - 0.23)$, respectively, where $N$ is the number of nodes in the network ($3 \le N \le 10^4$). A class of distributed routing algorithms for the DOLDL and the implementation procedure of an adaptive fault-tolerant algorithm are proposed. The correctness of the algorithm has also been verified by simulation.
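To make the stated bounds concrete, the following minimal Python sketch evaluates them for a given number of nodes. The function name `doldl_bounds` and the dictionary keys are hypothetical, chosen only for illustration. Because the abstract does not say which rounding the square brackets denote, the sketch returns the unrounded real-valued expressions.

```python
import math


def doldl_bounds(n: int) -> dict:
    """Evaluate the bounds quoted in the abstract for a network of n nodes.

    Assumption: the square brackets around the diameter bounds are left
    unapplied, since the abstract does not say which rounding they denote.
    """
    if not 3 <= n <= 10**4:
        raise ValueError("the abstract states the bounds for 3 <= N <= 10^4")
    root = math.sqrt(3 * n)          # sqrt(3N), shared by all four bounds
    scale = 5 * n / (9 * (n - 1))    # the factor 5N / (9(N-1))
    return {
        "d_lower": root - 2,
        "d_upper": root + 1,
        "a_lower": scale * (root - 1.8),
        "a_upper": scale * (root - 0.23),
    }


if __name__ == "__main__":
    for key, value in doldl_bounds(12).items():
        print(f"{key}: {value:.4f}")
```

For N = 12, sqrt(3N) is exactly 6, so the diameter bounds evaluate to 4 and 7 before any rounding is applied.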
Abstract: A contour-parallel offset (CPO) tool-path linking algorithm is derived that requires no tool retractions and is widely applicable in practice. The concept of a "tool-path loop tree" (TPL-tree), which provides information on the parent/child relationships among the tool-path loops (TPLs), is presented. The direction, tool-path loop, leaf/branch, layer number, and corresponding points of the TPL-tree are introduced. By defining a TPL as a vector and traversing the whole tree, a CPO tool-path without tool retractions can be derived.
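The idea of deriving a linked path by traversing a loop tree can be illustrated with a small Python sketch. This is not the paper's algorithm: the `TPLNode` class, the `link_order` function, and the rule of returning to the parent loop after finishing each child are assumptions made here for illustration. The abstract's direction, layer number, and corresponding points are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TPLNode:
    """A hypothetical tool-path loop with its child loops."""
    name: str
    children: List["TPLNode"] = field(default_factory=list)


def link_order(node: TPLNode) -> List[str]:
    """Return the sequence of loops visited by a depth-first walk.

    After each child subtree is finished, the walk steps back onto the
    parent loop, so consecutive entries are always parent/child neighbours.
    """
    order = [node.name]
    for child in node.children:
        order.extend(link_order(child))
        order.append(node.name)  # step back onto the parent loop
    return order


if __name__ == "__main__":
    tree = TPLNode("A", [TPLNode("B", [TPLNode("D")]), TPLNode("C")])
    print(link_order(tree))
```

In this sketch every move is between a loop and its parent or child, which is the property a retraction-free linking would rely on if adjacent loops can be joined at their corresponding points.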
Funding: Supported by the National Natural Science Foundation of China (61773220, 61876192, 61907021), the National Natural Science Foundation of Hubei (ZRMS2019000752), the Fundamental Research Funds for the Central Universities (2662018QD057, CZT20022, CZT20020), Academic Team in Universities (KTZ20051), and School Talent Funds (YZZ19004).
Abstract: To improve the energy efficiency of a direct expansion air conditioning (DX A/C) system while guaranteeing occupant comfort, a hierarchical controller for a DX A/C system with uncertain parameters is proposed. The control strategy consists of an open-loop optimization controller and a closed-loop guaranteed cost periodically intermittent-switch controller (GCPISC). The error dynamics of the closed-loop control system are modelled based on the GCPISC principle. Unlike previous DX A/C control methods, the controller designed in this paper performs control at discrete times. To ease the controller design, a series of matrix inequalities is derived as sufficient conditions for the lower-layer closed-loop GCPISC controller. In this way, the DX A/C system output is driven to follow the optimal references obtained through the upper-layer open-loop controller in exponential time, and the energy efficiency of the system is improved. Moreover, a static optimization problem is solved to obtain an optimal GCPISC law that ensures a minimum upper bound on the DX A/C system performance, considering both energy efficiency and output tracking error. The advantages of the designed hierarchical controller for a DX A/C system with uncertain parameters are demonstrated through simulation results.
Funding: Supported by the National Natural Science Foundation of China (No. 60873212).
Abstract: This paper describes a low-noise phase-locked loop (PLL) design method to achieve minimum jitter. Based on the phase noise properties extracted from the transistor, and on the low-pass or high-pass transfer characteristics from the different noise sources to the output, an optimal loop-bandwidth design method derived from a continuous-time PLL model further improves the jitter characteristics of the PLL. The described method not only finds the optimal loop bandwidth that minimizes the overall PLL jitter, but also achieves that bandwidth by changing the value of the resistor or the charge pump current. In addition, a phase-domain behavioral model in ADS is presented for accurately predicting the improved jitter performance of a PLL at the system level. A prototype PLL designed in a 0.18 μm CMOS technology is used to investigate the accuracy of the theoretical predictions. The simulation shows a significant performance improvement with the proposed method. The simulated RMS and peak-to-peak jitter of the PLL at the optimal loop bandwidth are 10.262 ps and 46.851 ps, respectively.