To improve system reliability and performance and to reduce system cost, volume, and weight, we designed, fabricated, and tested the multibus adapter system of a trimodular redundant fault-tolerant computer system on a single 5000-gate CMOS gate array chip. This paper discusses the design, fabrication, and testing of this single-chip system.
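As an illustration of the redundancy principle behind a trimodular redundant system, the following minimal Python sketch shows a bitwise 2-of-3 majority voter over three module outputs. The abstract does not describe the voting logic of the adapter chip, so this is a generic sketch rather than the authors' design; the names `majority_vote` and `disagreeing_modules` are assumptions made for the example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three redundant module outputs.

    Each output bit takes the value that at least two modules agree on,
    so a fault confined to one module is masked. (Illustrative only.)
    """
    return (a & b) | (a & c) | (b & c)


def disagreeing_modules(a: int, b: int, c: int) -> list[int]:
    """Return the indices (0, 1, 2) of modules whose output differs from the vote."""
    voted = majority_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


if __name__ == "__main__":
    # Module 2 has one flipped bit; the voter masks it and the check flags it.
    outputs = (0b10110010, 0b10110010, 0b10111010)
    print(bin(majority_vote(*outputs)), disagreeing_modules(*outputs))
```

A voter of this kind masks any fault confined to a single module; two modules that fail in the same bit position would defeat it.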
In recent years, the railway environment and systems such as CBTC (communication-based train control) have been changing. To respond to these changes and to customer needs, a UTCS (unified train control system) has been developed to realize a system that evolves with customers. Previous systems consist of independent components such as the ATC (automatic train control) system, the electronic interlocking system, and the facility monitoring system; their configurations and functions overlap in complicated ways, and their underlying concepts differ. The integrated train control system, by contrast, consists of horizontal layers: a function layer, a network layer, and a terminal layer. It has therefore been developed to be simple, free of unnecessary redundancy, and able to evolve to meet customer needs. In this paper, we explain a method that realizes the interlocking function for a CBTC system in the function layer, based on the concept of “securing a train travelling path” (including path blocking and routing), and we evaluate the safety of the method using STAMP/STPA.
As system scale increases, the inherent reliability of supercomputers becomes lower and lower. The cost of fault handling and task recovery increases so rapidly that the reliability issue will soon harm the usability of supercomputers. This issue is referred to as the "reliability wall" and is regarded as a critical problem for current and future supercomputers. To address it, we propose an autonomous fault-tolerant system, named Iaso, in the MilkyWay-2 system. Iaso introduces the concept of autonomous management in supercomputers: the computer itself, rather than human operators, takes charge of fault management. Iaso automatically manages the whole lifecycle of faults, including fault detection, fault diagnosis, fault isolation, and task recovery. It endows the MilkyWay-2 system with autonomous features such as self-awareness, self-diagnosis, self-healing, and self-protection. With the help of Iaso, the cost of fault handling in supercomputers is reduced from several hours to a few seconds. Iaso greatly improves the usability and reliability of the MilkyWay-2 system.
This paper describes an onboard computer with dual processing modules. Each processing module is composed of a 32-bit ARM reduced instruction set computer (RISC) processor and other commercial off-the-shelf devices. A set of fault-handling mechanisms is implemented in the computer system, enabling it to tolerate a single fault. The onboard software is organized around a set of processes that communicate with each other through a routing process. While meeting an extremely tight set of constraints on mass, volume, power consumption, and space environmental conditions, the fault-tolerant onboard computer has data processing capability that can meet the requirements of micro-satellite missions.
This paper addresses effective fault diagnosis and fault-tolerant control for aeronautical electromechanical actuators. Combining the advantages of model-driven and data-driven methods, a fault-tolerant nonsingular terminal sliding mode control method based on a support vector machine (SVM) is proposed. An SVM is designed to estimate the fault by off-line learning from small-sample data, solved as a convex quadratic programming problem, and is introduced into a high-gain observer to improve state estimation and fault detection accuracy when a fault occurs. The state estimate of the observer is used for state reconfiguration. A novel nonsingular terminal sliding mode surface is designed, and Lyapunov theory is used to derive a parameter adaptation law and a control law. The proposed controller is guaranteed to achieve asymptotic stability, which is superior to many advanced fault-tolerant controllers. In addition, the parameter estimation helps to diagnose system faults, because faults are reflected in parameter variations. Extensive comparative simulation and experimental results illustrate the effectiveness of the proposed controller and its advantages over several other mainstream controllers.
Readout errors caused by measurement noise are a significant source of errors in quantum circuits; they severely affect the output results and are an urgent problem in noisy intermediate-scale quantum (NISQ) computing. In this paper, we use the bit-flip averaging (BFA) method to mitigate frequent readout errors in quantum generative adversarial networks (QGAN) for image generation. BFA simplifies the structure of the response matrix by averaging over random bit-flips applied to the qubits in advance, which avoids the high measurement cost of traditional error mitigation methods. Our experiments were simulated in Qiskit using a handwritten digit image recognition dataset. Under the BFA-based method, the Kullback-Leibler (KL) divergence of the generated images converges to 0.04, 0.05, and 0.1 for readout error probabilities of p=0.01, p=0.05, and p=0.1, respectively. Additionally, by evaluating the fidelity of the quantum states representing the images, we observe average fidelity values of 0.97, 0.96, and 0.95 for the three readout error probabilities, respectively. These results demonstrate the robustness of the model in mitigating readout errors and provide a highly fault-tolerant mechanism for image generation models.
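To make the idea of response-matrix inversion concrete, the following Python sketch corrects a single-qubit measured distribution under the assumption that, after bit-flip averaging, the response matrix is symmetric with one flip probability p, i.e. [[1-p, p], [p, 1-p]]. This is a simplified illustration, not the paper's QGAN pipeline and not a Qiskit API; the function names `apply_readout_noise` and `mitigate_readout` are hypothetical.

```python
def apply_readout_noise(true_probs, p):
    """Apply a symmetric single-qubit readout error with flip probability p."""
    p0, p1 = true_probs
    return ((1 - p) * p0 + p * p1, p * p0 + (1 - p) * p1)


def mitigate_readout(measured_probs, p):
    """Invert the symmetric response matrix [[1-p, p], [p, 1-p]].

    The inverse is 1/(1-2p) * [[1-p, -p], [-p, 1-p]], defined for p < 0.5.
    With finite-shot data the result can fall slightly outside [0, 1];
    this sketch does not clip or renormalize.
    """
    if not 0 <= p < 0.5:
        raise ValueError("flip probability must satisfy 0 <= p < 0.5")
    q0, q1 = measured_probs
    scale = 1 - 2 * p
    return (((1 - p) * q0 - p * q1) / scale, ((1 - p) * q1 - p * q0) / scale)


if __name__ == "__main__":
    noisy = apply_readout_noise((0.8, 0.2), p=0.1)
    print(noisy, mitigate_readout(noisy, p=0.1))
```

Under this symmetry assumption, only a single parameter per qubit needs to be calibrated, which is the sense in which a simplified response matrix reduces measurement cost.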
For permanent faults (PF) in the power communication network (PCN), such as link interruptions, the time-sensitive networking (TSN) that the PCN relies on typically employs spatial-redundancy fault-tolerance methods to maintain service stability and reliability, which often limits TSN scheduling performance in fault-free ideal states. This paper therefore proposes a graph attention residual network-based routing and fault-tolerant scheduling mechanism (GRFS) for data flows in the PCN. First, it includes a communication system architecture for integrated terminals based on a cyclic queuing and forwarding (CQF) model and a fault recovery method; this reduces the impact of faults through the simplified scheduling configuration of CQF and a fault-tolerance policy that prioritizes the rerouting of faulty time-sensitive (TS) flows. Second, considering that PF-induced changes in network topology are more appropriately handled by making routing and time-slot injection decisions hop by hop, and that a reasonable network load can reduce the damage caused by PF and reserve resources for rerouting faulty TS flows, an optimization model for joint routing and scheduling is constructed with scheduling success rate as the objective and traffic latency and network load as constraints. Third, to capture changes in TSN topology and traffic load, a D3QN algorithm based on a multi-head graph attention residual network (MGAR) is designed to solve the model: the MGAR-based encoder reconstructs the TSN status into feature embedding vectors, and a dueling network decoder decodes them. Simulation results show that GRFS outperforms heuristic fault-tolerance algorithms and other benchmark schemes by approximately 10% in routing and scheduling success rate in ideal states and 5% in rerouting and rescheduling success rate in fault states.
Mobile Edge Computing (MEC) is a technology designed for the on-demand provisioning of computing and storage services, strategically positioned close to users. In the MEC environment, frequently accessed content can be deployed and cached on edge servers to optimize the efficiency of content delivery, ultimately enhancing the quality of the user experience. However, because edge devices and nodes are typically placed at the network’s periphery, they may face various fault tolerance challenges, including network instability, device failures, and resource constraints. Given the dynamic nature of MEC, making high-quality content caching decisions for real-time mobile applications, especially latency-sensitive ones, by effectively utilizing mobility information remains a significant challenge. In response, this paper introduces FT-MAACC, a mobility-aware caching solution grounded in multi-agent deep reinforcement learning and equipped with fault tolerance mechanisms. The approach integrates content adaptivity algorithms to evaluate the priority of highly user-adaptive cached content. Furthermore, it relies on collaborative caching strategies based on multi-agent deep reinforcement learning models and establishes a fault-tolerance model to ensure the system’s reliability, availability, and persistence. Empirical results demonstrate that FT-MAACC outperforms its peer methods in cache hit rate and transmission latency.
In this paper, the multisensor data fusion technique of a fault-tolerant integrated navigation system is discussed. A neural approach to data fusion is proposed for multisensor integrated systems. The simulation results show that this neural approach is feasible.
Distributed systems with high performance and stability are commonly adopted in large-scale scientific and engineering computing. In this paper, we discuss a fault-tolerant mechanism in a Linux environment that improves the fault tolerance of the system, namely a scheme and framework for forming a stable computing platform. Based on the structure and function of the distributed system, active list and file invocation strategies are employed in task management. Multilevel fault tolerance is achieved by repeating processes on a single node and migrating tasks across multiple nodes. The manager node agent introduced in this paper administers the nodes using the list and dispatches tasks according to node performance, and can thus make full use of the cluster resources. An evaluation method is proposed to appraise the performance. The analysis shows that the proposed scheme is useful, at the cost of some additional memory overhead.
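The two fault-tolerance levels named in this abstract, repeating a process on a single node and then migrating the task to another node, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' manager node agent: the function `run_with_fault_tolerance`, the use of `RuntimeError` to signal a failed run, and the plain list standing in for the active node list are all hypothetical.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def run_with_fault_tolerance(
    task: Callable[[str], T],
    active_nodes: Sequence[str],
    retries_per_node: int = 2,
) -> Tuple[str, T]:
    """Run `task` with two levels of fault tolerance.

    Level 1: repeat the task on the same node up to `retries_per_node` times.
    Level 2: if the node keeps failing, migrate to the next node in the list.
    Returns the node that succeeded together with the task result.
    """
    for node in active_nodes:
        for _attempt in range(retries_per_node):
            try:
                return node, task(node)
            except RuntimeError:
                continue  # retry on the same node, then fall through to migrate
    raise RuntimeError("task failed on every active node")


if __name__ == "__main__":
    def flaky(node: str) -> int:
        # Hypothetical task: "node-a" is down, every other node works.
        if node == "node-a":
            raise RuntimeError("node down")
        return 42

    print(run_with_fault_tolerance(flaky, ["node-a", "node-b"]))
```

Ordering `active_nodes` by measured node performance would correspond to dispatching tasks according to performance, as the abstract describes.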
We propose a fault-tolerant tree-based multicast algorithm for 2-dimensional (2-D) meshes based on the concept of the extended safety level, a vector associated with each node that captures fault information in its neighborhood. In this approach, each destination is reached through a minimum number of hops. To minimize the total number of traffic steps, three heuristic strategies are proposed. The approach can be easily implemented with pipelined circuit switching (PCS). A simulation study is conducted to measure the total number of traffic steps under the different strategies. Our approach is the first attempt to address the fault-tolerant tree-based multicast problem in 2-D meshes based on limited global information with a simple model and succinct information.
To address current sensor failures in electric pitch systems, a variable universe fuzzy fault-tolerant control method based on single current detection is proposed for the electric pitch control system. When a fault occurs in one or two current sensors, the proposed method reconstructs the missing current information using the direct current (DC) bus current sensor and updates the three-phase currents within any two adjacent sampling periods, ensuring stability of the closed-loop system. Switchover and fault-tolerant control of the faulty current sensor are then accomplished by a fault diagnosis method based on adaptive threshold judgment. To address the reconstructed-signal error caused by the modulation method, and in view of the main control target of the electric pitch system, a variable universe fuzzy control method is used in the speed loop, which improves disturbance rejection under load variation and the robustness of the fault-tolerant system. The results show that, with the proposed fault-tolerant control method, the variable pitch control system retains good control characteristics under sensor failure, although part of the system performance is lost, which verifies the correctness of the proposed method.
In this paper, a method of intelligent fault-tolerant management for electromechanical equipment is presented. It is based on condition monitoring of the equipment and is realized through condition prediction and condition control. An example is introduced and analyzed.
To handle the effects of single event upsets (SEU), which are common in computers operating in the space radiation environment, a new fault-tolerant system with dual-module redundancy is proposed using the dynamic reconfiguration technique of field programmable gate arrays (FPGA). The system provides detection and backup-replacement functions, that is, it can perform self-detection and self-healing, and it thereby achieves a design with low hardware redundancy and high resource utilization. It can not only detect a fault but also repair it effectively after failure. The method is therefore especially practical for dynamically reconfigurable computers based on FPGAs. The design methodology has been verified with a Virtex-4 FPGA on the Xilinx ML403 development platform.
A single-chip multiprocessor (CMP) combined with fault-tolerant (FT) techniques offers an ideal architecture for achieving high availability while sustaining high computing performance. The FT design of a single-chip multiprocessor is described, including techniques ranging from hardware redundancy to software support and firmware strategy. The design aims at masking the influence of errors and automatically correcting the system state.
An active fault-tolerant control scheme based on the sliding mode technique is investigated for spacecraft attitude control systems with external disturbances and actuator faults. First, the dynamic and kinematic equations of the spacecraft are given. For the spacecraft dynamics in the faulty case, a fault diagnosis component uses a nonlinear observer for fault detection and estimation. Based on the fault estimate obtained during fault diagnosis, the fault-tolerant control scheme is developed using the backstepping sliding mode control technique. Lyapunov theory is used to analyze the stability of the closed-loop attitude system. Finally, simulation results for the attitude dynamics models show the feasibility of the proposed fault-tolerant scheme.
A new fault-tolerant control (FTC) method based on controller reconfiguration is studied for general stochastic nonlinear systems. Unlike the formulation of classical FTC methods, the measured information available for FTC is assumed to be the probability density functions (PDFs) of the system output rather than its measured value. A radial basis function (RBF) neural network technique is proposed so that the output PDFs can be formulated in terms of the dynamic weights of the RBF neural network. As a result, the nonlinear FTC problem subject to the dynamic relation between the input and the output PDFs can be transformed into a nonlinear FTC problem subject to the dynamic relation between the control input and the weights of the RBF neural network approximation of the output PDFs. The FTC design consists of two steps. The first step is fault detection and diagnosis (FDD), which raises an alarm when there is a fault in the system and locates the faulty component. The second step is to adapt the controller to the faulty case so that the system can still achieve its target. A feasible FTC method based on linear matrix inequalities (LMI) is applied so that the fault can be detected and diagnosed. An illustrative example is included to demonstrate the efficiency of the proposed algorithm, and satisfactory results have been obtained.
This paper addresses fault-tolerant control, with consideration of actuator faults, for a networked control system (NCS) with packet loss. The NCS with data packet loss can be described as a switched system model. A packet-loss-dependent Lyapunov function is used, and fault-tolerant controllers are proposed for an arbitrary packet loss process and for a Markovian packet loss process, respectively. Considering a controlled plant with an external energy-bounded disturbance, a robust H∞ fault-tolerant controller is designed for the NCS. These results are also extended to the NCS with both packet loss and network-induced delay. Numerical examples are given to illustrate the effectiveness of the proposed design method.
A novel fault-tolerant adaptive control methodology against actuator faults is proposed. Actuator effectiveness factors (AEFs) are introduced to denote the health of the actuators, and the unscented Kalman filter (UKF) is employed for online estimation of both the motion states and the AEFs of a mobile robot. A square-root version of the UKF is introduced to improve efficiency and numerical stability. Using the information from the UKF, the reconfigurable controller is designed automatically based on an enhanced inverse dynamic control (IDC) methodology. An experiment on a 3-DOF omni-directional mobile robot is performed, and the effectiveness of the proposed method is demonstrated.
Funding (Iaso autonomous fault-tolerant system for MilkyWay-2): This work was partially supported by the National High-tech R&D Program of China (863 Program) (2012AA01A301, 2012AA010901), by the Program for New Century Excellent Talents in University, and by the National Natural Science Foundation of China (Grant Nos. 61272142, 61103082, 61170261, and 61103193).
Funding (SVM-based fault-tolerant control of electromechanical actuators): Supported by the National Natural Science Foundation of China (Grant No. 51975294) and the Fundamental Research Funds for the Central Universities of China (Grant No. 30922010706).
Funding (readout error mitigation in QGAN): Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF049) and the Joint Fund of the Natural Science Foundation of Shandong Province (Grant Nos. ZR2022LLZ012 and ZR2021LLZ001).
Funding (GRFS routing and fault-tolerant scheduling for the PCN): Supported by the project Research and Application of Edge IoT Technology for Distributed New Energy Consumption in Distribution Areas (Project No. 5108-202218280A-2-394-XG).
Funding (FT-MAACC mobility-aware caching): Supported by the Innovation Fund Project of Jiangxi Normal University (YJS2022065) and the Domestic Visiting Program of Jiangxi Normal University.
Funding: The Creative Research Team Foundation of the National Natural Science Foundation of China (No. 50221903).
Abstract: Distributed systems with high performance and stability are commonly adopted in large-scale scientific and engineering computing. In this paper, we discuss a fault-tolerant mechanism for the Linux environment that improves the fault tolerance of the system, namely a scheme and framework for building a stable computing platform. Based on the structure and function of the distributed system, active list and file invocation strategies are employed in task management. Multilevel fault tolerance is achieved by repeated processes within a single node and task migration across multiple nodes. The manager node agent introduced in this paper administers the nodes using the list and assigns tasks according to the nodes' performance, and is thus able to make full use of the cluster resources. An evaluation method is proposed to appraise the performance. The analysis shows that the proposed scheme is useful, at the cost of some additional memory overhead.
Funding: NSF of USA under grant CCR 99O0646 and grant ANI 0073736.
Abstract: We propose a fault-tolerant tree-based multicast algorithm for 2-dimensional (2-D) meshes based on the concept of the extended safety level, which is a vector associated with each node to capture fault information in the neighborhood. In this approach, each destination is reached through a minimum number of hops. To minimize the total number of traffic steps, three heuristic strategies are proposed. The approach can be easily implemented by pipelined circuit switching (PCS). A simulation study is conducted to measure the total number of traffic steps under the different strategies. Our approach is the first attempt to address the fault-tolerant tree-based multicast problem in 2-D meshes based on limited global information with a simple model and succinct information.
Funding: Natural Science Foundation of Gansu Province (Joint) Project (No. 213244); Natural Science Foundation of Gansu Province (No. 145RJZA136); Youth Science Foundation of Lanzhou Jiaotong University (No. 2013040).
Abstract: To address current sensor failures in electric pitch systems, a variable universe fuzzy fault tolerant control method based on single current detection is proposed for the electric pitch control system. When a fault occurs in one or two current sensors, the proposed method reconstructs the missing current information using the direct current (DC) bus current sensor, and the three-phase currents can be updated within any two adjacent sampling periods, which ensures stability of the closed-loop system. Switchover and fault tolerant control for the faulty current sensor are then accomplished by a fault diagnosis method based on adaptive threshold judgment. To handle the reconstructed-signal error caused by the modulation method, and in view of the main control target of the electric pitch system, a variable universe fuzzy control method is used in the speed loop, which improves the disturbance rejection under load variation and the robustness of the fault tolerant system. The results show that, with the fault tolerant control method, the variable pitch control system retains good control characteristics under sensor failure, although part of the system performance is lost, which verifies the proposed method.
Abstract: In this paper, a method of intelligent fault tolerant management for electromechanical equipment is presented. It is based on condition monitoring of the equipment and is realized through condition prediction and condition control. An example is introduced and analyzed.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60971036; the National High Technology Research and Development Program of China under Grant No. 2008AA01Z104; the Fundamental Research Funds for the Central Universities under Grant No. ZYGX2009Z004; and the New Century Excellent Talents in University under Grant No. NCET-08-0087.
Abstract: To handle the effects of single event upsets (SEU), which are common in computers operating in the space radiation environment, a new fault-tolerant system with dual-module redundancy is proposed using the dynamic reconfiguration technique of field programmable gate arrays (FPGA). The system provides detection and backup-switching functions, that is, self-detection and self-healing, so that a design with low hardware redundancy and high resource utilization can be achieved. It can therefore not only detect a fault but also repair it effectively after a failure. Hence, the method is especially practical for dynamically reconfigurable computers based on FPGAs. The design methodology has been verified on a Virtex-4 FPGA on the Xilinx ML403 development platform.
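A dual-module arrangement can detect a fault because the two copies disagree, but a comparison alone cannot say which copy is wrong, so recovery needs an extra step. The sketch below assumes one simple recovery policy: on a mismatch, both modules are reloaded from a known-good configuration (standing in for FPGA reconfiguration) and the computation is repeated. The names `golden`, `upset`, and `DualModuleSystem` are hypothetical, and this is not the paper's actual design.

```python
from typing import Callable, List

Module = Callable[[int], int]


def golden(x: int) -> int:
    """Reference behaviour that a freshly configured module provides."""
    return x * x + 1


def upset(x: int) -> int:
    """Module whose configuration was corrupted: one output bit is flipped."""
    return golden(x) ^ 0b100


class DualModuleSystem:
    def __init__(self, a: Module, b: Module) -> None:
        self.modules: List[Module] = [a, b]
        self.reconfigurations = 0

    def compute(self, x: int, max_attempts: int = 3) -> int:
        for _ in range(max_attempts):
            out_a, out_b = (m(x) for m in self.modules)
            if out_a == out_b:  # self-detection: the two outputs agree
                return out_a
            # self-healing (assumed policy): reload both modules, then retry
            self.modules = [golden, golden]
            self.reconfigurations += 1
        raise RuntimeError("modules still disagree after reconfiguration")


system = DualModuleSystem(golden, upset)
value = system.compute(5)
```

Reloading both modules avoids having to identify the faulty one, at the cost of repeating the computation once per detected upset.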
Funding: Supported by the National High Technology Development 863 Program of China (2002AA1Z030) and the China Postdoctoral Science Foundation (2003034151).
Abstract: A single-chip multiprocessor (CMP) combined with fault-tolerant (FT) techniques offers an ideal architecture for achieving high availability while sustaining high computing performance. The FT design of a single-chip multiprocessor is described, including techniques ranging from hardware redundancy to software support and firmware strategy. The design aims at masking the influence of errors and automatically correcting the system state.
Funding: Partially supported by the National Natural Science Foundation of China (No. 61473143), the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX18_0299), and the China Scholarships Council (No. 201806830102).
Abstract: An active fault tolerant control scheme is investigated for spacecraft attitude control systems with external disturbances and actuator faults by using the sliding mode technique. First, the dynamic and kinematic equations of the spacecraft are given. For the dynamic model of the spacecraft in the faulty case, a fault diagnosis component based on a nonlinear observer is used for fault detection and estimation. Using the fault estimates obtained during fault diagnosis, the fault tolerant control scheme is developed by adopting the backstepping sliding mode control technique. Lyapunov theory is used to analyze the stability of the closed-loop attitude system. Finally, simulation results for the attitude dynamics models show the feasibility of the proposed fault tolerant scheme.
Funding: Supported by the UK Leverhulme Trust (F/00 120/BC) and the National Natural Science Foundation of China (60828007, 60974029).
Abstract: A new fault tolerant control (FTC) method based on controller reconfiguration is studied for general stochastic nonlinear systems. Unlike classical FTC formulations, it is supposed that the information measured for the FTC is the probability density function (PDF) of the system output rather than its measured value. A radial basis function (RBF) neural network technique is proposed so that the output PDFs can be formulated in terms of the dynamic weights of the RBF neural network. As a result, the nonlinear FTC problem subject to the dynamic relation between the input and the output PDFs can be transformed into a nonlinear FTC problem subject to the dynamic relation between the control input and the weights of the RBF neural network approximation of the output PDFs. The FTC design consists of two steps. The first step is fault detection and diagnosis (FDD), which raises an alarm when there is a fault in the system and also locates the faulty component. The second step is to adapt the controller to the faulty case so that the system is still able to achieve its target. A linear matrix inequality (LMI) based feasible FTC method is applied such that the fault can be detected and diagnosed. An illustrative example is included to demonstrate the efficiency of the proposed algorithm, and satisfactory results have been obtained.
Funding: Supported by the National Natural Science Foundation of China (No. 60874052).
Abstract: In this paper, fault tolerant control under actuator faults is addressed for a networked control system (NCS) with packet loss. The NCS with data packet loss can be described as a switched system model. A packet-loss-dependent Lyapunov function is used, and fault tolerant controllers are proposed for an arbitrary packet loss process and for a Markovian packet loss process, respectively. Considering a controlled plant with an external energy-bounded disturbance, a robust H∞ fault tolerant controller is designed for the NCS. These results are also extended to the NCS with both packet loss and network-induced delay. Numerical examples are given to illustrate the effectiveness of the proposed design method.
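To make the "switched system" description concrete, one common textbook-style formulation is sketched below. It is an illustrative assumption, not necessarily the model used in the paper: the symbols $A$, $B$, $K$, $M$ and the indicator $\theta_k$ are introduced here, and the actuator is assumed to apply zero input when a packet is lost.

```latex
% Plant with state feedback sent over a lossy network (assumed notation):
%   \theta_k = 1 if the control packet at step k arrives, 0 if it is lost.
%   M = diag(m_1, ..., m_p), 0 <= m_i <= 1, models partial actuator faults.
\begin{align}
  x_{k+1} &= A x_k + B M u_k, \qquad u_k = \theta_k K x_k, \qquad \theta_k \in \{0, 1\}, \\
  x_{k+1} &= A_{\theta_k} x_k, \qquad A_1 = A + B M K, \qquad A_0 = A .
\end{align}
```

Under these assumptions the closed loop switches between the two modes $A_1$ and $A_0$ according to $\theta_k$, which may follow an arbitrary sequence or a Markov chain, matching the two loss processes named in the abstract.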
Funding: This project is supported by the National Hi-tech Research and Development Program of China (863 Program, No. 2003AA421020).
Abstract: A novel fault-tolerant adaptive control methodology against actuator faults is proposed. Actuator effectiveness factors (AEFs) are introduced to denote the health of the actuators, and the unscented Kalman filter (UKF) is employed for online estimation of both the motion states and the AEFs of the mobile robot. A square-root version of the UKF is introduced to improve efficiency and numerical stability. Using the information from the UKF, the reconfigurable controller is designed automatically based on an enhanced inverse dynamic control (IDC) methodology. An experiment on a 3-DOF omni-directional mobile robot is performed, and the effectiveness of the proposed method is demonstrated.
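The role of an actuator effectiveness factor can be shown with a small sketch. It assumes a multiplicative fault model in which actuator i delivers `aef[i] * command[i]` with `aef[i]` in (0, 1], and it takes the AEF estimate as given rather than computing it with a UKF. The function names and the `min_aef` clamp are hypothetical, and the simple division below is a stand-in for, not a reproduction of, the paper's IDC-based reconfigurable controller.

```python
from typing import List


def apply_faults(commands: List[float], aef: List[float]) -> List[float]:
    """Assumed fault model: each actuator delivers aef[i] * commands[i]."""
    return [a * u for a, u in zip(aef, commands)]


def compensate(desired: List[float], aef_estimate: List[float],
               min_aef: float = 0.05) -> List[float]:
    """Scale commands up so that degraded actuators still deliver `desired`."""
    # Clamp the estimate to avoid dividing by a value near zero.
    return [u / max(a, min_aef) for u, a in zip(desired, aef_estimate)]


desired = [1.0, -2.0, 0.5]
true_aef = [1.0, 0.5, 0.8]  # the second actuator has lost half its effectiveness
commands = compensate(desired, true_aef)
delivered = apply_faults(commands, true_aef)
```

With a perfect estimate the delivered inputs match the desired ones up to floating-point rounding; in practice the estimate is noisy and actuator saturation limits how far a command can be scaled.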