Recent architectures of multi-core systems may have a relatively large number of cores that typically ranges from tens to hundreds;therefore called many-core systems.Such systems require an efficient interconnection n...Recent architectures of multi-core systems may have a relatively large number of cores that typically ranges from tens to hundreds;therefore called many-core systems.Such systems require an efficient interconnection network that tries to address two major problems.First,the overhead of power and area cost and its effect on scalability.Second,high access latency is caused by multiple cores’simultaneous accesses of the same shared module.This paper presents an interconnection scheme called N-conjugate Shuffle Clusters(NCSC)based on multi-core multicluster architecture to reduce the overhead of the just mentioned problems.NCSC eliminated the need for router devices and their complexity and hence reduced the power and area costs.It also resigned and distributed the shared caches across the interconnection network to increase the ability for simultaneous access and hence reduce the access latency.For intra-cluster communication,Multi-port Content Addressable Memory(MPCAM)is used.The experimental results using four clusters and four cores each indicated that the average access latency for a write process is 1.14785±0.04532 ns which is nearly equal to the latency of a write operation in MPCAM.Moreover,it was demonstrated that the average read latency within a cluster is 1.26226±0.090591 ns and around 1.92738±0.139588 ns for read access between cores from different clusters.展开更多
In this paper,a typical experiment is carried out based on a high-resolution air-sea coupled model,namely,the coupled ocean-atmosphere-wave-sediment transport(COAWST)model,on both heterogeneous many-core(SW)and homoge...In this paper,a typical experiment is carried out based on a high-resolution air-sea coupled model,namely,the coupled ocean-atmosphere-wave-sediment transport(COAWST)model,on both heterogeneous many-core(SW)and homogenous multicore(Intel)supercomputing platforms.We construct a hindcast of Typhoon Lekima on both the SW and Intel platforms,compare the simulation results between these two platforms and compare the key elements of the atmospheric and ocean modules to reanalysis data.The comparative experiment in this typhoon case indicates that the domestic many-core computing platform and general cluster yield almost no differences in the simulated typhoon path and intensity,and the differences in surface pressure(PSFC)in the WRF model and sea surface temperature(SST)in the short-range forecast are very small,whereas a major difference can be identified at high latitudes after the first 10 days.Further heat budget analysis verifies that the differences in SST after 10 days are mainly caused by shortwave radiation variations,as influenced by subsequently generated typhoons in the system.These typhoons generated in the hindcast after the first 10 days attain obviously different trajectories between the two platforms.展开更多
With the development of computer technology, network bandwidth and network traffic continue to increase. Considering the large data flow, it is imperative to perform inspection effectively on network packets. In order...With the development of computer technology, network bandwidth and network traffic continue to increase. Considering the large data flow, it is imperative to perform inspection effectively on network packets. In order to find a solution of deep packet inspection which can appropriate to the current network environment, this paper built a deep packet inspection system based on many-core platform, and in this way, verified the feasibility to implement a deep packet inspection system under many-core platform with both high performance and low consumption. After testing and analysis of the system performance, it has been found that the deep packet inspection based on many-core platform TILE_Gx36 [1] [2] can process network traffic of which the bandwidth reaches up to 4 Gbps. To a certain extent, the performance has improved compared to most deep packet inspection system based on X86 platform at present.展开更多
Simulators are generally used during the design of computer architectures. Typically, different simulators with different levels of complexity, speed and accuracy are used. However, for early design space exploration,...Simulators are generally used during the design of computer architectures. Typically, different simulators with different levels of complexity, speed and accuracy are used. However, for early design space exploration, simulators with less complexity, high simulation speed and reasonable accuracy are desired. It is also required that these simulators have a short development time and that changes in the design require less effort in the implementation in order to perform experiments and see the effects of changes in the design. These simulators are termed high-level simulators in the context of computer architecture. In this paper, we present multiple levels of abstractions in a high-level simulation of a general-purpose many-core system, where the objective of every level is to improve the accuracy in simulation without significantly affecting the complexity and simulation speed.展开更多
A model following adaptive control system for CSIM is presented in this paper. A dynamic mathematical model of slip control based system is obtained. With the help of model reducing technique, full order model is ...A model following adaptive control system for CSIM is presented in this paper. A dynamic mathematical model of slip control based system is obtained. With the help of model reducing technique, full order model is reduced to simplify the design without degrading much of the performance. Model following adaptive control laws in discrete form are derived. These laws satisfy the hyperstability condition for taking care of the load and machine parameter changes of the drive. A microprocessor 8098 is used to develop the speed controller. The implementation of the control system uses only available variables of the reference model and the controlled plant. Experimental results are given to demonstrate the good performance of the system.展开更多
A new kind of simple and flexible CO2 welding system was developed to carry out waveform control. The system consisted of IGBT inverter, PWM circuit and microprocessor unit ( MPU) , in which the output current of co...A new kind of simple and flexible CO2 welding system was developed to carry out waveform control. The system consisted of IGBT inverter, PWM circuit and microprocessor unit ( MPU) , in which the output current of constant current (CC) power supply could be changed according to transient physical state, and the variable down slope rate control could be used to ensure a stable welding process. The welding experiment results proved the effectiveness of this control approach.展开更多
Virtual memory management is always a very essential issue of the modern microprocessor design. A memory management unit (MMU) is designed to implement a virtual machine for user programs, and provides a management me...Virtual memory management is always a very essential issue of the modern microprocessor design. A memory management unit (MMU) is designed to implement a virtual machine for user programs, and provides a management mechanism between the operating system and user programs. This paper analyzes the tradeoffs considered in the MMU design of Unity 11 CPU of Peking University, and introduces in detail the solution of pure hardware table walking with two level page table organization. The implementation takes care of required operations and high performances needed by modern operating systems and low costs needed by embedded systems. This solution has been silicon proven, and successfully porting the Linux 2.4.17 kernel, the XWindow system, GNOME and most application software onto the Unity platform.展开更多
The article discusses the possibility of further modernization of the standard microprocessor relay protection of AC overhead system feeders DPA-27.5-TNF, which is operated on the Trans-Baikal Railway by creating an a...The article discusses the possibility of further modernization of the standard microprocessor relay protection of AC overhead system feeders DPA-27.5-TNF, which is operated on the Trans-Baikal Railway by creating an additional automated system of unified templates necessary for the occurrence of “trainability” elements. The templates will be formed via a separate dedicated channel for transmission, processing and storage of the necessary information, not related to the operation of the terminal, with its subsequent visualization at the workplace of the duty personnel of traction substations, together with information from the “GID” software received via another dedicated wired channel. With the help of such a base of unified preset templates, in the future, it will be possible not only to identify the specific causes of each emergency shutdown but also to reduce their number by dynamically adjusting the existing presets of the standard operation algorithm.展开更多
This paper introduces a SF vector control system of a slip frequency controlled induction mo-tor with simple structure,fair performance and convenient operation.It is realized by two singlechip microprocessors and fed...This paper introduces a SF vector control system of a slip frequency controlled induction mo-tor with simple structure,fair performance and convenient operation.It is realized by two singlechip microprocessors and fed from SPWM-GTR inverter.The whole system is combined by twosubsystems,both of them are 8031 single chip microprocessors.The communication between themis coordinated by the full duplex serial port within the chip and ask-and-answer communicationmanner.The error-corrected means adopted has improved the operation reliability of the system.A series of experimental results on a 3 kW induction motor are given at the end of this paper.展开更多
In this paper the authors present an analysis and the implementation of microprocessor-baseddigital phase-locked loop speed control system for an induction motor which is actuated by aSPWM-GTR inverter.The system is c...In this paper the authors present an analysis and the implementation of microprocessor-baseddigital phase-locked loop speed control system for an induction motor which is actuated by aSPWM-GTR inverter.The system is controlled by a 16-bit single chip microprocessor.A new type of frequency and phase detector is presented in detail,An adaptive method isadopted in speed controller.A three mode control scheme is used.These techniques are very use-ful to the improvement of the dynamic behavior of digital AC motor drive system.Experimental results show that the system is of good stability,high precision and good dynam-ic performance.展开更多
We propose a novel scheme, called on-line cache resizing (OCR), to dynamically resize the cache and meet the size requirement of each application. At each periodic interval, the scheme gathers the cache hit-miss sta...We propose a novel scheme, called on-line cache resizing (OCR), to dynamically resize the cache and meet the size requirement of each application. At each periodic interval, the scheme gathers the cache hit-miss statistics at runtime using an extra tag array. These executing statistics serve as inputs to an analytical model of cache energy. The scheme uses energy as a primary metric to dynamically increase/decrease the number of active cache ways for the next interval. The scheme minimizes the active cache size to save energy with minimal performance loss. The simulation with SPEC 2000 benchmarks shows that OCR results in an average of 38.4% energy saving compared with fixed-size caches, with only 2.0% performance loss.展开更多
In the era of Internet of Things, the battery life of edge devices must be extended for sensing connection to the Internet. We aim to reduce the power consumption of the microprocessor embedded in such devices by usin...In the era of Internet of Things, the battery life of edge devices must be extended for sensing connection to the Internet. We aim to reduce the power consumption of the microprocessor embedded in such devices by using a novel dynamically reconfigurable accelerator. Conventional microprocessors consume a large amount of power for memory access, in registers, and for the control of the processor itself rather than computation;this decreases the energy efficiency. Dynamically reconfigurable accelerators reduce such redundant power by computing in parallel on reconfigurable switches and processing element arrays (often consisting of an arithmetic logic unit (ALU) and registers). We propose a novel dynamically reconfigurable accelerator “DYNaSTA” composed of a dynamically reconfigurable data path and static ALU arrays. The static ALU arrays process instructions in parallel without registers and improve energy efficiency. The dynamically reconfigurable data path includes registers and many switches dynamically reconfigured to resolve operand dependencies between instructions mapped on the static ALU array, and forwards appropriate operands to the static ALU array. Therefore, the DYNaSTA accelerator has more flexibility while improving the energy efficiency compared with the conventional dynamically reconfigurable accelerators. We simulated the power consumption of the proposed DYNaSTA accelerator and measured the fabricated chip. As a result, the power consumption was reduced by 69% to 86%, and the energy efficiency improved 4.5 to 13 times compared to a general RISC microprocessor.展开更多
The article discusses the possibility of a potential reduction in the number of operations of microprocessor relay protection of feeders of the contact network of AC railways TsZA-27.5-FKS (FTS) for unknown reasons. R...The article discusses the possibility of a potential reduction in the number of operations of microprocessor relay protection of feeders of the contact network of AC railways TsZA-27.5-FKS (FTS) for unknown reasons. Real statistics on the number of microprocessor relay protection operations at the Buryatskaya traction substation are presented, simulation of the real train situation (in accordance with the regime maps of the throughput capacity of the sections of the Trans-Baikal railway) was carried out in the specialized software complex “KORTES”. Based on the results of the analysis of simulation modeling, the process of forming a unified template of settings using neural network technologies is considered, which characterizes only this specific regular train situation. To protect objects in the event of pre-emergency and emergency modes of operation of the traction power supply system, a variant of changing the standard operation algorithm of the TsZA-27.5-FKS (FTS) terminal by introducing additional blocks for calculating the predictive functions of current and voltage has been proposed.展开更多
UV wavelength auto-tuned tuned output system is realized by the difference method. Controlled by the microprocessor, output wavelength auto- tracking is achieved.Besides, equipment self-checking auto-positioning and t...UV wavelength auto-tuned tuned output system is realized by the difference method. Controlled by the microprocessor, output wavelength auto- tracking is achieved.Besides, equipment self-checking auto-positioning and temperature correct are realized,The wavelength tuned output efficiency in the experiment is better than 97 %.展开更多
This paper proposes a different method to eliminate base wander and power line interference in electrocardiogram, which introduces the integer coefficient filter theory and gives the detail for designing digital filte...This paper proposes a different method to eliminate base wander and power line interference in electrocardiogram, which introduces the integer coefficient filter theory and gives the detail for designing digital filter to remove these two normal noise signals. Signal from the MIT-BIH electrocardiogram database was used to test the performance of the filter. From the test results, the performance of the digital filer is reDT good. The filter coefficient is an integer number, therefore, the filtering algorithm can be successfully implemented on the microprocessor.展开更多
Microprocessors such as those found in PCs and smartphones are complex in their design and nature.In recent years,an increasing number of security vulnerabilities have been found within these microprocessors that can ...Microprocessors such as those found in PCs and smartphones are complex in their design and nature.In recent years,an increasing number of security vulnerabilities have been found within these microprocessors that can leak sensitive user data and information.This report will investigate microarchitecture vulnerabilities focusing on the Spectre and Meltdown exploits and will look at what they do,how they do it and,the real-world impact these vulnerabilities can cause.Additionally,there will be an introduction to the basic concepts of how several PC components operate to support this.展开更多
The paper introduces a temperature control systembased on AT89C51 single-chip-microprocessor, and discussesthe principle , hardware structure, and software design of thissystem in detail.
文摘Recent architectures of multi-core systems may have a relatively large number of cores that typically ranges from tens to hundreds;therefore called many-core systems.Such systems require an efficient interconnection network that tries to address two major problems.First,the overhead of power and area cost and its effect on scalability.Second,high access latency is caused by multiple cores’simultaneous accesses of the same shared module.This paper presents an interconnection scheme called N-conjugate Shuffle Clusters(NCSC)based on multi-core multicluster architecture to reduce the overhead of the just mentioned problems.NCSC eliminated the need for router devices and their complexity and hence reduced the power and area costs.It also resigned and distributed the shared caches across the interconnection network to increase the ability for simultaneous access and hence reduce the access latency.For intra-cluster communication,Multi-port Content Addressable Memory(MPCAM)is used.The experimental results using four clusters and four cores each indicated that the average access latency for a write process is 1.14785±0.04532 ns which is nearly equal to the latency of a write operation in MPCAM.Moreover,it was demonstrated that the average read latency within a cluster is 1.26226±0.090591 ns and around 1.92738±0.139588 ns for read access between cores from different clusters.
基金This work is supported by the National Key Research and Development Plan program of the Ministry of Science and Technology of China(No.2016YFB0201100)Additionally,this work is supported by the National Laboratory for Marine Science and Technology(Qingdao)Major Project of the Aoshan Science and Technology Innovation Program(No.2018ASKJ01-04)the Open Fundation of Key Laboratory of Marine Science and Numerical Simulation,Ministry of Natural Resources(No.2021-YB-02).
文摘In this paper,a typical experiment is carried out based on a high-resolution air-sea coupled model,namely,the coupled ocean-atmosphere-wave-sediment transport(COAWST)model,on both heterogeneous many-core(SW)and homogenous multicore(Intel)supercomputing platforms.We construct a hindcast of Typhoon Lekima on both the SW and Intel platforms,compare the simulation results between these two platforms and compare the key elements of the atmospheric and ocean modules to reanalysis data.The comparative experiment in this typhoon case indicates that the domestic many-core computing platform and general cluster yield almost no differences in the simulated typhoon path and intensity,and the differences in surface pressure(PSFC)in the WRF model and sea surface temperature(SST)in the short-range forecast are very small,whereas a major difference can be identified at high latitudes after the first 10 days.Further heat budget analysis verifies that the differences in SST after 10 days are mainly caused by shortwave radiation variations,as influenced by subsequently generated typhoons in the system.These typhoons generated in the hindcast after the first 10 days attain obviously different trajectories between the two platforms.
文摘With the development of computer technology, network bandwidth and network traffic continue to increase. Considering the large data flow, it is imperative to perform inspection effectively on network packets. In order to find a solution of deep packet inspection which can appropriate to the current network environment, this paper built a deep packet inspection system based on many-core platform, and in this way, verified the feasibility to implement a deep packet inspection system under many-core platform with both high performance and low consumption. After testing and analysis of the system performance, it has been found that the deep packet inspection based on many-core platform TILE_Gx36 [1] [2] can process network traffic of which the bandwidth reaches up to 4 Gbps. To a certain extent, the performance has improved compared to most deep packet inspection system based on X86 platform at present.
文摘Simulators are generally used during the design of computer architectures. Typically, different simulators with different levels of complexity, speed and accuracy are used. However, for early design space exploration, simulators with less complexity, high simulation speed and reasonable accuracy are desired. It is also required that these simulators have a short development time and that changes in the design require less effort in the implementation in order to perform experiments and see the effects of changes in the design. These simulators are termed high-level simulators in the context of computer architecture. In this paper, we present multiple levels of abstractions in a high-level simulation of a general-purpose many-core system, where the objective of every level is to improve the accuracy in simulation without significantly affecting the complexity and simulation speed.
文摘A model following adaptive control system for CSIM is presented in this paper. A dynamic mathematical model of slip control based system is obtained. With the help of model reducing technique, full order model is reduced to simplify the design without degrading much of the performance. Model following adaptive control laws in discrete form are derived. These laws satisfy the hyperstability condition for taking care of the load and machine parameter changes of the drive. A microprocessor 8098 is used to develop the speed controller. The implementation of the control system uses only available variables of the reference model and the controlled plant. Experimental results are given to demonstrate the good performance of the system.
基金Supported by Research Project of Henan Science and Technology Foundation(0124110209,0211061900).
文摘A new kind of simple and flexible CO2 welding system was developed to carry out waveform control. The system consisted of IGBT inverter, PWM circuit and microprocessor unit ( MPU) , in which the output current of constant current (CC) power supply could be changed according to transient physical state, and the variable down slope rate control could be used to ensure a stable welding process. The welding experiment results proved the effectiveness of this control approach.
文摘Virtual memory management is always a very essential issue of the modern microprocessor design. A memory management unit (MMU) is designed to implement a virtual machine for user programs, and provides a management mechanism between the operating system and user programs. This paper analyzes the tradeoffs considered in the MMU design of Unity 11 CPU of Peking University, and introduces in detail the solution of pure hardware table walking with two level page table organization. The implementation takes care of required operations and high performances needed by modern operating systems and low costs needed by embedded systems. This solution has been silicon proven, and successfully porting the Linux 2.4.17 kernel, the XWindow system, GNOME and most application software onto the Unity platform.
文摘The article discusses the possibility of further modernization of the standard microprocessor relay protection of AC overhead system feeders DPA-27.5-TNF, which is operated on the Trans-Baikal Railway by creating an additional automated system of unified templates necessary for the occurrence of “trainability” elements. The templates will be formed via a separate dedicated channel for transmission, processing and storage of the necessary information, not related to the operation of the terminal, with its subsequent visualization at the workplace of the duty personnel of traction substations, together with information from the “GID” software received via another dedicated wired channel. With the help of such a base of unified preset templates, in the future, it will be possible not only to identify the specific causes of each emergency shutdown but also to reduce their number by dynamically adjusting the existing presets of the standard operation algorithm.
文摘This paper introduces a SF vector control system of a slip frequency controlled induction mo-tor with simple structure,fair performance and convenient operation.It is realized by two singlechip microprocessors and fed from SPWM-GTR inverter.The whole system is combined by twosubsystems,both of them are 8031 single chip microprocessors.The communication between themis coordinated by the full duplex serial port within the chip and ask-and-answer communicationmanner.The error-corrected means adopted has improved the operation reliability of the system.A series of experimental results on a 3 kW induction motor are given at the end of this paper.
文摘In this paper the authors present an analysis and the implementation of microprocessor-baseddigital phase-locked loop speed control system for an induction motor which is actuated by aSPWM-GTR inverter.The system is controlled by a 16-bit single chip microprocessor.A new type of frequency and phase detector is presented in detail,An adaptive method isadopted in speed controller.A three mode control scheme is used.These techniques are very use-ful to the improvement of the dynamic behavior of digital AC motor drive system.Experimental results show that the system is of good stability,high precision and good dynam-ic performance.
基金The High Technology Research and Development Program of China (No.2006AA01Z226)the Natural Science Foundation of Hubei (No.2007ABD002)the Ministry of Education-INTEL Information Technology Foundation (No.MOE-INTEL-08-05)
文摘We propose a novel scheme, called on-line cache resizing (OCR), to dynamically resize the cache and meet the size requirement of each application. At each periodic interval, the scheme gathers the cache hit-miss statistics at runtime using an extra tag array. These executing statistics serve as inputs to an analytical model of cache energy. The scheme uses energy as a primary metric to dynamically increase/decrease the number of active cache ways for the next interval. The scheme minimizes the active cache size to save energy with minimal performance loss. The simulation with SPEC 2000 benchmarks shows that OCR results in an average of 38.4% energy saving compared with fixed-size caches, with only 2.0% performance loss.
文摘In the era of Internet of Things, the battery life of edge devices must be extended for sensing connection to the Internet. We aim to reduce the power consumption of the microprocessor embedded in such devices by using a novel dynamically reconfigurable accelerator. Conventional microprocessors consume a large amount of power for memory access, in registers, and for the control of the processor itself rather than computation;this decreases the energy efficiency. Dynamically reconfigurable accelerators reduce such redundant power by computing in parallel on reconfigurable switches and processing element arrays (often consisting of an arithmetic logic unit (ALU) and registers). We propose a novel dynamically reconfigurable accelerator “DYNaSTA” composed of a dynamically reconfigurable data path and static ALU arrays. The static ALU arrays process instructions in parallel without registers and improve energy efficiency. The dynamically reconfigurable data path includes registers and many switches dynamically reconfigured to resolve operand dependencies between instructions mapped on the static ALU array, and forwards appropriate operands to the static ALU array. Therefore, the DYNaSTA accelerator has more flexibility while improving the energy efficiency compared with the conventional dynamically reconfigurable accelerators. We simulated the power consumption of the proposed DYNaSTA accelerator and measured the fabricated chip. As a result, the power consumption was reduced by 69% to 86%, and the energy efficiency improved 4.5 to 13 times compared to a general RISC microprocessor.
文摘The article discusses the possibility of a potential reduction in the number of operations of microprocessor relay protection of feeders of the contact network of AC railways TsZA-27.5-FKS (FTS) for unknown reasons. Real statistics on the number of microprocessor relay protection operations at the Buryatskaya traction substation are presented, simulation of the real train situation (in accordance with the regime maps of the throughput capacity of the sections of the Trans-Baikal railway) was carried out in the specialized software complex “KORTES”. Based on the results of the analysis of simulation modeling, the process of forming a unified template of settings using neural network technologies is considered, which characterizes only this specific regular train situation. To protect objects in the event of pre-emergency and emergency modes of operation of the traction power supply system, a variant of changing the standard operation algorithm of the TsZA-27.5-FKS (FTS) terminal by introducing additional blocks for calculating the predictive functions of current and voltage has been proposed.
文摘UV wavelength auto-tuned tuned output system is realized by the difference method. Controlled by the microprocessor, output wavelength auto- tracking is achieved.Besides, equipment self-checking auto-positioning and temperature correct are realized,The wavelength tuned output efficiency in the experiment is better than 97 %.
文摘This paper proposes a different method to eliminate base wander and power line interference in electrocardiogram, which introduces the integer coefficient filter theory and gives the detail for designing digital filter to remove these two normal noise signals. Signal from the MIT-BIH electrocardiogram database was used to test the performance of the filter. From the test results, the performance of the digital filer is reDT good. The filter coefficient is an integer number, therefore, the filtering algorithm can be successfully implemented on the microprocessor.
文摘Microprocessors such as those found in PCs and smartphones are complex in their design and nature.In recent years,an increasing number of security vulnerabilities have been found within these microprocessors that can leak sensitive user data and information.This report will investigate microarchitecture vulnerabilities focusing on the Spectre and Meltdown exploits and will look at what they do,how they do it and,the real-world impact these vulnerabilities can cause.Additionally,there will be an introduction to the basic concepts of how several PC components operate to support this.
文摘The paper introduces a temperature control systembased on AT89C51 single-chip-microprocessor, and discussesthe principle , hardware structure, and software design of thissystem in detail.