Funding: Supported by the National Natural Science Foundation of China.
Abstract: In this paper, we discuss the parallel domain decomposition method (DDM) for solving PDEs on parallel computers. Three types of DDM are discussed in a uniform framework: DDM with overlapping, DDM without overlapping, and DDM with a fictitious component. The convergence of asynchronous parallel algorithms based on DDM is also discussed.
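As a concrete illustration of the overlapping variant, the short sketch below runs a damped additive Schwarz iteration on a 1D Poisson problem; the two-subdomain split, overlap width, damping factor, and problem size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal additive Schwarz sketch for -u'' = f on (0, 1), u(0) = u(1) = 0.
# The subdomain split, overlap, and damping below are illustrative assumptions.
n = 99                                # number of interior grid points
h = 1.0 / (n + 1)
f = np.ones(n)                        # right-hand side f(x) = 1
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

overlap = 8
left = np.arange(0, n // 2 + overlap)         # indices of subdomain 1
right = np.arange(n // 2 - overlap, n)        # indices of subdomain 2

u = np.zeros(n)
for it in range(50):
    r = f - A @ u                             # global residual
    du = np.zeros(n)
    for idx in (left, right):                 # independent local solves
        Aloc = A[np.ix_(idx, idx)]
        du[idx] += np.linalg.solve(Aloc, r[idx])
    u += 0.5 * du                             # damped additive update
    if np.linalg.norm(r) < 1e-8:
        break

print("final residual norm:", np.linalg.norm(f - A @ u))
```

In a parallel setting the two local solves would run on different processors, and the asynchronous variants discussed in the paper would relax the strict per-iteration synchronization used in this sketch.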
Abstract: Previously, a single data-path stack was adequate for data-path chips, and the complexity and size of the data-path were comparatively small. As current data-path chips, such as systems-on-a-chip (SOC), become more complex, multiple data-path stacks are required to implement the entire data-path. As more data-path stacks are integrated into an SOC, the data-path is becoming a critical part of giga-scale integrated circuit (GSI) design. The traditional physical design methodology cannot satisfy data-path performance requirements, because it cannot accommodate the bit-sliced structure of the data-path and its strict performance constraints (such as timing, coupling, and crosstalk). Challenges in data-path physical design are addressed, the fundamental problems and key technologies are analysed, and corresponding research and solutions in this field are discussed.
Abstract: This paper presents a new test data compression/decompression method for SoC testing, called hybrid run-length codes. The method fully analyses the factors that influence the test parameters: compression ratio, test application time, and area overhead. To improve the compression ratio, the new method is based on variable-to-variable run-length codes, and a novel algorithm is proposed to reorder the test vectors and fill the unspecified bits in the pre-processing step. With a novel on-chip decoder, hybrid run-length codes achieve low test application time and low area overhead. Finally, an experimental comparison on the ISCAS'89 benchmark circuits validates the proposed method.
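To make the pre-processing step concrete, the sketch below fills the unspecified ('X') bits of a test cube so that runs are extended, then applies a plain run-length encoding; the fill heuristic and the (bit, length) output format are illustrative assumptions and do not reproduce the paper's hybrid variable-to-variable code or on-chip decoder.

```python
# Illustrative sketch: fill 'X' (unspecified) bits to lengthen runs,
# then run-length encode the resulting fully specified test vector.
# The fill heuristic and (bit, length) output format are assumptions.

def fill_unspecified(test_cube: str) -> str:
    filled, prev = [], '0'              # default fill when the vector starts with X
    for bit in test_cube:
        if bit == 'X':
            filled.append(prev)         # repeat the previous bit to extend the run
        else:
            filled.append(bit)
            prev = bit
    return ''.join(filled)

def run_length_encode(vector: str) -> list[tuple[str, int]]:
    runs, current, count = [], vector[0], 1
    for bit in vector[1:]:
        if bit == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = bit, 1
    runs.append((current, count))
    return runs

cube = "XX01XXX111X0XX"
filled = fill_unspecified(cube)         # "00011111111000"
print(filled)
print(run_length_encode(filled))        # [('0', 3), ('1', 8), ('0', 3)]
```

Longer runs mean fewer codewords, which is exactly where a reordering and fill heuristic pays off before encoding.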
基金the National"973"Basic Research Programof China (2004CB318202)
Abstract: Dynamic voltage scaling (DVS), supported by many DVS-enabled processors, is an efficient technique for energy-efficient embedded systems. Many researchers work on DVS and have presented various DVS algorithms, some with quite good results. However, the previous algorithms either have a large time complexity or produce results that are sensitive to the number of available voltage modes: fine-grained voltage modes lead to near-optimal results, while coarse-grained voltage modes yield less optimal ones. A new algorithm based on ant colony optimization, called ant colony optimization voltage and task scheduling (ACO-VTS), is presented; it achieves a low time complexity through parallelization, and a linear-time approximation of it is also given. Both versions generate quite good results, saving up to 30% more energy than previous algorithms under coarse-grained modes, and their results do not depend on the number of available modes.
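The sketch below illustrates why mode granularity matters in DVS: a task's required frequency is rounded up to the nearest available mode, and energy is estimated with the common E = C·V²·f·t model. The mode tables, task parameters, and selection rule are illustrative assumptions, not values or algorithms from the paper.

```python
# Illustrative DVS sketch: energy of running one task under discrete
# voltage/frequency modes, using E = C * V^2 * f * t with t = cycles / f.
# Mode tables and task parameters below are assumptions for illustration.

def energy(cycles: float, volt: float, freq: float, cap: float = 1.0) -> float:
    time = cycles / freq
    return cap * volt**2 * freq * time          # equals cap * volt^2 * cycles

def run_task(cycles: float, deadline: float, modes: list[tuple[float, float]]) -> float:
    """Pick the slowest (volt, freq) mode that still meets the deadline."""
    feasible = [(v, f) for v, f in modes if cycles / f <= deadline]
    volt, freq = min(feasible, key=lambda vf: vf[1])
    return energy(cycles, volt, freq)

coarse = [(0.8, 200e6), (1.2, 600e6)]                       # two modes
fine = [(0.8, 200e6), (0.9, 300e6), (1.0, 400e6),
        (1.1, 500e6), (1.2, 600e6)]                         # five modes

cycles, deadline = 2.5e8, 1.0                               # needs >= 250 MHz
print("coarse-grained energy:", run_task(cycles, deadline, coarse))
print("fine-grained energy:  ", run_task(cycles, deadline, fine))
```

With only two modes the task is forced to the fast, high-voltage mode and wastes energy; a mode-insensitive scheduler such as the one described in the abstract aims to avoid exactly this penalty.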
Abstract: A hardware-software co-simulation method for system-on-chip (SOC) design is discussed. It is based on an instruction set simulator (ISS) and an event-driven hardware simulator, with a bus interface model described in the C language providing the interface between the two. The bus interface model and the ISS are linked into a single program, the software simulator, which communicates with the hardware simulator through Windows sockets. The implementation of the bus interface model and the synchronization between the hardware and software simulators are discussed in detail. Co-simulation control of the hardware simulator is also discussed.
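To illustrate the socket link between the two simulators, the sketch below shows a minimal lockstep exchange in which a stand-in "software simulator" sends one bus transaction per time step and waits for the "hardware simulator" reply; the port number, JSON message format, and threading are assumptions for illustration, not the paper's protocol.

```python
# Minimal sketch of socket-based co-simulation synchronization: one thread
# stands in for the event-driven hardware simulator, the main thread for the
# software simulator (ISS + bus interface model). Port and message format
# are illustrative assumptions.
import json
import socket
import threading

HOST, PORT = "127.0.0.1", 50007
ready = threading.Event()

def hardware_simulator():
    with socket.create_server((HOST, PORT)) as srv:
        ready.set()
        conn, _ = srv.accept()
        with conn:
            for _ in range(3):
                req = json.loads(conn.recv(1024).decode())       # bus request
                resp = {"cycle": req["cycle"], "data": req["addr"] ^ 0xFF}
                conn.sendall(json.dumps(resp).encode())          # bus response

threading.Thread(target=hardware_simulator, daemon=True).start()
ready.wait()

with socket.create_connection((HOST, PORT)) as sw:
    for cycle in range(3):                                       # lockstep loop
        sw.sendall(json.dumps({"cycle": cycle, "addr": 0x10 + cycle}).encode())
        reply = json.loads(sw.recv(1024).decode())
        print("cycle", cycle, "bus read returned", hex(reply["data"]))
```

The blocking send/receive pair is what keeps the two simulators synchronized: neither side advances its notion of time until the other has answered the current transaction.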
Abstract: To solve computationally expensive problems, multiprocessor SoCs (MPSoCs) are frequently used. Mapping applications to MPSoC architectures and scheduling their tasks are key problems in the system-level design of embedded systems. In this paper, a cluster slack optimization algorithm is described, in which the tasks in a cluster are simultaneously mapped and scheduled for heterogeneous MPSoC architectures. In our approach, the tasks are iteratively clustered and each cluster is optimized with a branch-and-bound technique to capitalize on slack distribution. The proposed static task mapping and scheduling method is applied to pipelined data stream processing as well as to batch processing. In pipelined processing, the tradeoff between throughput and memory cost can be exploited by adjusting a weighting parameter. Furthermore, an energy-aware task mapping and scheduling algorithm based on our cluster slack optimization is developed. Experimental results show improvements in latency, throughput, and energy.
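As a simplified stand-in for the mapping-and-scheduling step, the sketch below greedily assigns the tasks of a small dependency graph to heterogeneous processing elements, choosing for each task the PE that gives the earliest finish time; the task graph, execution-time table, and greedy rule are illustrative assumptions and are not the paper's cluster slack optimization.

```python
# Illustrative greedy mapping/scheduling sketch for a heterogeneous MPSoC.
# Task graph, per-PE execution times, and the earliest-finish-time rule are
# assumptions for illustration (the paper uses cluster slack optimization
# with branch and bound).

# exec_time[task][pe] = execution time of task on processing element pe
exec_time = {"A": [3, 5], "B": [4, 2], "C": [6, 3], "D": [2, 2]}
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}   # D waits on B and C

pe_free = [0.0, 0.0]          # time at which each PE becomes free
finish = {}                   # task -> finish time
mapping = {}                  # task -> chosen PE

for task in ["A", "B", "C", "D"]:                 # topological order
    ready = max((finish[d] for d in deps[task]), default=0.0)
    # choose the PE giving the earliest finish time for this task
    best_pe = min(range(2), key=lambda p: max(ready, pe_free[p]) + exec_time[task][p])
    start = max(ready, pe_free[best_pe])
    finish[task] = start + exec_time[task][best_pe]
    pe_free[best_pe] = finish[task]
    mapping[task] = best_pe

print("mapping:", mapping)
print("schedule length:", max(finish.values()))
```

A slack-aware optimizer improves on such a greedy pass by redistributing the idle time between dependent tasks across the clusters it forms.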
Funding: The National Natural Science Foundation of China (Nos. 11371003 and 11461006), the Special Fund for Scientific and Technological Bases and Talents of Guangxi (No. 2016AD05050), the Special Fund for Bagui Scholars of Guangxi, the Major Tendering Project of the National Social Science Foundation (No. 17ZDA160), the Sichuan Science and Technology Project (No. 19YYJC0038), and the Fundamental Research Funds for the Central Universities, SWUN (No. 2019NYB20).
Abstract: It is important to evaluate the function behaviors and performance features of task scheduling algorithms in multi-processor systems. A novel dynamic measurement method (DMM) was proposed to measure a task scheduling algorithm's correctness and dependability. In a multi-processor system, the task scheduling problem is represented by a combinatorial evaluation model, the interactive Markov chain (IMC), and the solution space of the algorithm, with time and probability metrics, is described by action-based continuous stochastic logic (aCSL). DMM derives a path by logging runtime scheduling actions and their corresponding times. By judging whether the derived path can be accepted by the task scheduling IMC model, DMM analyses the correctness of the algorithm; by judging whether the actual values satisfy the label function of the initial state, DMM analyses the dependability of the algorithm. Simulation shows that DMM can effectively characterize the function behaviors and performance features of task scheduling algorithms.
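The core check, whether a logged run is accepted by the scheduling model, can be illustrated with a plain labelled transition system: the sketch below replays a sequence of (action, time) pairs against a transition table and rejects the trace if an action is not enabled or violates a simple timing bound. The states, actions, and bounds are illustrative assumptions, not the IMC/aCSL formulation of the paper.

```python
# Illustrative sketch: replay a logged (action, timestamp) trace against a
# small labelled transition model with per-transition time bounds.
# States, actions, and bounds are assumptions; the paper checks an IMC model
# against aCSL properties.

# transitions[state][action] = (next_state, max_delay_allowed)
transitions = {
    "idle":    {"dispatch": ("running", 1.0)},
    "running": {"preempt": ("ready", 0.5), "finish": ("idle", 10.0)},
    "ready":   {"dispatch": ("running", 1.0)},
}

def accepts(trace: list[tuple[str, float]], start: str = "idle") -> bool:
    state, last_time = start, 0.0
    for action, t in trace:
        if action not in transitions.get(state, {}):
            return False                      # action not enabled in this state
        next_state, max_delay = transitions[state][action]
        if t - last_time > max_delay:
            return False                      # timing bound violated
        state, last_time = next_state, t
    return True

logged = [("dispatch", 0.2), ("preempt", 0.6), ("dispatch", 1.1), ("finish", 4.0)]
print("trace accepted:", accepts(logged))     # True for this scheduling log
```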
Funding: The National Natural Science Foundation of China, the Science & Technology Project of Beijing, a Chinese Academy of Sciences funding project, and a Synopsys funding project.
Abstract: This paper presents a test resource partitioning technique based on an efficient response compaction design called the quotient compactor (q-Compactor). Because the q-Compactor is a single-output compactor, high compaction ratios can be obtained even for chips with a small number of outputs. Theorems for the design of the q-Compactor are presented to achieve full diagnosability, minimize error cancellation, and handle unknown bits in the outputs of the circuit under test (CUT). The q-Compactor can also be moved to the load-board, so as to compact the output response of the CUT even during functional testing. Therefore, the number of tester channels required to test the chip is significantly reduced. Experimental results on the ISCAS'89 benchmark circuits and an MPEG-2 decoder SoC show that the proposed compaction scheme is very efficient.
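As a generic illustration of single-output response compaction (not the q-Compactor design itself, whose construction is given by the paper's theorems), the sketch below folds a multi-output response stream into one serial bit per cycle using an XOR tree, and shows how an unknown ('X') bit masks the compacted output for that cycle.

```python
# Illustrative single-output space compactor: XOR each parallel output slice
# of the CUT response into one serial bit per clock cycle. This is a generic
# XOR compactor sketch, not the q-Compactor construction from the paper.

def compact(response_slices: list[str]) -> str:
    """Each slice is the CUT's parallel outputs in one cycle, e.g. '0110'."""
    out = []
    for sl in response_slices:
        if "X" in sl:
            out.append("X")                   # an unknown masks the whole cycle
        else:
            out.append(str(sum(b == "1" for b in sl) % 2))   # XOR (parity)
    return "".join(out)

golden = ["0110", "1111", "0001", "1010"]
faulty = ["0110", "1101", "0001", "1010"]     # one flipped bit in cycle 2
with_x = ["0110", "11X1", "0001", "1010"]     # unknown bit in cycle 2

print(compact(golden))   # 0010
print(compact(faulty))   # 0110 -> differs from golden, fault observed
print(compact(with_x))   # 0X10 -> cycle 2 is masked by the unknown
```

The error-cancellation and unknown-handling theorems in the paper address exactly the weaknesses this naive XOR scheme exhibits, while keeping a single compacted output.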
基金Supported by the "985" Basic Research Foundation of Tsinghua University of China (No. JC2001024)
Abstract: A hybrid decomposition method for molecular dynamics simulations was presented, using spatial decomposition and force decomposition simultaneously to fit the architecture of a cluster of symmetric multi-processor (SMP) nodes. The method distributes particles between nodes based on the spatial decomposition strategy to reduce inter-node communication costs, and partitions particle pairs within each node using the force decomposition strategy to improve the load balance on each node. Simulation results for a nucleation process with 4,000,000 particles show that the hybrid method achieves better parallel performance than either spatial or force decomposition alone, especially when applied to a large-scale particle system with non-uniform spatial density.
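The sketch below illustrates the two levels of the decomposition in toy form: particles are first binned to nodes by spatial cells, and each node's local particle pairs are then dealt out round-robin to its processors. The grid size, node and processor counts, and the round-robin rule are illustrative assumptions, not the paper's scheme.

```python
# Toy illustration of hybrid decomposition: spatial decomposition across
# nodes, force (pair) decomposition across processors within a node.
# Grid size, node count, and round-robin pair assignment are assumptions.
import itertools
import random

random.seed(0)
particles = [(random.random(), random.random()) for _ in range(40)]

NODES_X = 2                                   # 2 x 2 spatial domains -> 4 nodes
def node_of(p):
    return (min(int(p[0] * NODES_X), NODES_X - 1),
            min(int(p[1] * NODES_X), NODES_X - 1))

by_node = {}
for i, p in enumerate(particles):             # spatial decomposition
    by_node.setdefault(node_of(p), []).append(i)

PROCS_PER_NODE = 4
for node, idxs in sorted(by_node.items()):
    pairs = list(itertools.combinations(idxs, 2))        # local particle pairs
    per_proc = [pairs[k::PROCS_PER_NODE] for k in range(PROCS_PER_NODE)]
    loads = [len(x) for x in per_proc]                   # force decomposition
    print(f"node {node}: {len(idxs)} particles, pair loads {loads}")
```

Even when the spatial bins hold uneven particle counts, dealing the pair list across a node's processors keeps their per-node workloads balanced, which is the point of combining the two strategies.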
Abstract: An instruction set simulator (ISS) is a highly abstracted, executable model of a microarchitecture. It is widely used for verification and debugging during the development of microprocessors. However, with the emergence of chip multi-processors, a single-core ISS cannot meet the needs of microprocessor development. In this paper, we first introduce our multi-core chip architecture and then propose a general methodology for expanding a single-core ISS into a multi-core ISS (MCISS). On this basis, a real-time comparison environment is created for multi-core verification, and the problems of multi-core communication and synchronization are addressed gracefully. With the "save and restore" mechanism, the verification procedure and debugging are greatly sped up.
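One common way to grow a single-core ISS into a multi-core one is to instantiate several core models and step them round by round over a shared memory model; the tiny sketch below does exactly that for a made-up three-instruction ISA. The toy ISA, the lockstep stepping policy, and the shared-memory dictionary are illustrative assumptions, not the paper's MCISS design.

```python
# Minimal sketch: wrap several single-core ISS instances into a multi-core
# ISS by stepping them in lockstep over a shared memory model.
# The toy ISA (SET/STORE/LOAD) and the lockstep policy are assumptions.

class CoreISS:
    """Single-core ISS for a toy ISA: ('SET', reg, imm), ('STORE', reg, addr),
    ('LOAD', reg, addr)."""
    def __init__(self, core_id, program, memory):
        self.id, self.prog, self.mem = core_id, program, memory
        self.pc, self.regs = 0, [0] * 4

    def step(self):
        if self.pc >= len(self.prog):
            return False                      # core halted
        op, a, b = self.prog[self.pc]
        if op == "SET":
            self.regs[a] = b
        elif op == "STORE":
            self.mem[b] = self.regs[a]
        elif op == "LOAD":
            self.regs[a] = self.mem.get(b, 0)
        self.pc += 1
        return True

class MultiCoreISS:
    def __init__(self, programs):
        self.mem = {}                         # shared memory model
        self.cores = [CoreISS(i, p, self.mem) for i, p in enumerate(programs)]

    def run(self):
        running = True
        while running:                        # one step per core per round
            running = False
            for core in self.cores:
                running = core.step() or running

mc = MultiCoreISS([
    [("SET", 0, 42), ("STORE", 0, 0x100)],               # core 0 writes
    [("LOAD", 1, 0x100), ("LOAD", 1, 0x100)],            # core 1 reads twice
])
mc.run()
print("core 1 r1 =", mc.cores[1].regs[1])
```

Checkpointing each core's registers and the shared memory dictionary would be the natural place to hook a "save and restore" mechanism like the one mentioned in the abstract.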
Funding: Supported by the National Key Research and Development Program of China (No. 2017YFB0802502), the Aeronautical Science Foundation (No. 2017ZC51038), the National Natural Science Foundation of China (Nos. 62002006, 61702028, 61672083, 61370190, 61772538, 61532021, 61472429, and 61402029), the Foundation of Science and Technology on Information Assurance Laboratory (No. 1421120305162112006), the National Cryptography Development Fund (No. MMJJ20170106), the Defense Industrial Technology Development Program (No. JCKY2016204A102), and the Liaoning Collaboration Innovation Center for CSLE, China.
Abstract: With the wide application of electronic hardware in aircraft, such as air-to-ground communication, satellite communication, and positioning systems, aircraft hardware is facing great security pressure. Focusing on the security of aircraft hardware, this paper proposes a supervisory control architecture based on a secure System-on-a-Chip (SoC). The proposed architecture is attack-immune and trustworthy, and can support trusted escrow applications and Dynamic Integrity Measurement (DIM) without interference. It is characterized by a Trusted Monitoring System (TMS) that is hardware-isolated from the Main Processor System (MPS), together with a secure, unidirectional access channel from the TMS to the running memory of the MPS. Based on this architecture, the DIM program running on the TMS measures and calls the Lightweight Measurement Agent (LMA) program running on the MPS. In this way, the Operating System (OS) kernel, key software, and data of the MPS can be dynamically measured without disturbance, which makes it difficult for adversaries to attack through software. The architecture has been fully verified on an FPGA prototype system. Compared with existing systems, it achieves higher security and is more efficient at DIM, and it can fully supervise the running of applications and the aircraft hardware OS.
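Dynamic integrity measurement in this style typically boils down to hashing known-good memory regions at runtime and comparing the digests against a trusted baseline; the sketch below does this for a couple of in-memory byte regions. The regions, the choice of SHA-256, and the comparison policy are illustrative assumptions rather than the TMS/LMA implementation.

```python
# Illustrative dynamic integrity measurement sketch: hash monitored memory
# regions and compare against baseline measurements taken at a trusted time.
# Regions, SHA-256, and the comparison policy are assumptions for illustration.
import hashlib

def measure(region: bytes) -> str:
    return hashlib.sha256(region).hexdigest()

# Stand-ins for the OS kernel text section and a key kernel data structure.
kernel_text = bytearray(b"\x90" * 64 + b"\xc3")
syscall_table = bytearray(b"\x01\x02\x03\x04")

baseline = {
    "kernel_text": measure(bytes(kernel_text)),
    "syscall_table": measure(bytes(syscall_table)),
}

def dim_check(regions: dict[str, bytearray]) -> list[str]:
    """Return the names of regions whose current hash deviates from baseline."""
    return [name for name, region in regions.items()
            if measure(bytes(region)) != baseline[name]]

regions = {"kernel_text": kernel_text, "syscall_table": syscall_table}
print("tampered:", dim_check(regions))        # [] -> everything matches

syscall_table[0] = 0xFF                       # simulate a rootkit-style patch
print("tampered:", dim_check(regions))        # ['syscall_table']
```

In the proposed architecture this check runs on the isolated TMS over its unidirectional memory channel, so the measured MPS software cannot tamper with the measurement itself.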
Funding: Supported by the National Key R&D Program of China (2018YFB1004202) and the Laboratory of Software Engineering for Complex Systems.
Abstract: It is often the case that in the development of a system-on-a-chip (SoC) design, a family of SystemC transaction-level models (TLMs) is created. TLMs in the same family often share common functionality but differ in timing, implementation, configuration, and performance across the SoC development phases. In most cases, all the TLMs in a family must be verified for the follow-up design activities. In our previous work, we proposed to call such a family a TLM product line (TPL) and proposed a feature-oriented (FO) design methodology for efficient TPL development. However, developers could only verify the TLMs in a family one by one, which causes a large portion of duplicated verification overhead; functional verification of the TPL has therefore become a bottleneck in the proposed methodology. In this paper, we propose a novel TPL verification method for FO designs. For a given property, our method can exponentially reduce the number of TLMs to be verified by identifying mute feature modules (MFM), which avoids duplicated verification. The proposed method is presented both informally and formally, and its correctness is proved. Theoretical analysis and experimental results on a real design show the correctness and efficiency of the proposed method.
Abstract: A unified vector sorting algorithm (VSA) is proposed, which sorts N arbitrary numbers with c·log N bits on an SIMD multi-processor system (SMMP) with processors and a composite interconnected network in time, where c is an arbitrary positive constant. When is an arbitrarily small positive constant and u = log2 N, it is an O(log N) algorithm; when u = 1, c = 1 and e = 0.5 (a constant), it is an optimal algorithm (pT = O(N log N)).
Abstract: Since the softswitch is the kernel of the Next Generation Network (NGN), it is of practical significance to improve the availability of the softswitch system. This paper expatiates upon methods of realizing high availability in softswitch systems from a multi-level viewpoint: software-level high-availability design, platform-level high availability of the softswitch kernel components, and network-level high availability. Additionally, it gives some analysis of how network-level high availability is obtained.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 90307014).
Abstract: A novel amperometric immunosensor based on micro-electromechanical systems (MEMS) technology, using protein A and self-assembled monolayers (SAMs) for the orientation-controlled immobilization of antibodies, has been developed. Using MEMS technology, an "Au, Pt, Pt" three-microelectrode system enclosed in an SU-8 micro pool was fabricated. Employing SAMs, a monolayer of protein A was immobilized on the cysteamine-modified Au electrode to achieve orientation-controlled immobilization of the human immunoglobulin (HIgG) antibody. The immunosensor aims at low unit cost, small dimensions, a high level of integration, and the prospect of a biosensor system-on-a-chip. Cyclic voltammetry and chronoamperometry were conducted to characterize the immunosensor. Compared with traditional immunosensors using a bulky gold electrode or a screen-printed electrode and the procedure of directly binding protein A to the electrode for antibody immobilization, it has attractive advantages, such as miniaturization, compatibility with CMOS technology, fast response (30 s), a broad linear range (50-400 pg/L), and a low detection limit (10 pg/L) for HIgG. In addition, this immunosensor is easy to design into a microarray and to use for simultaneous multi-parameter detection.