
A Data-Flow Soft-Fault Correction Algorithm for Spaceborne Computers (cited by: 7)

A Software-Based Method for "Soft Error" Correction in Space Computers
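Abstract (from the Chinese) In the space environment, cosmic rays frequently cause transient faults in the memory cells of computer systems. Such faults are usually handled in hardware or at the system level, but these solutions are costly and heavy. For this fault type, the paper proposes a software-implemented data-flow fault correction algorithm. By applying simple encoding and decoding operations to the variables of a program, the algorithm detects and then corrects single-bit errors that occur in the program's data space. Fault injection experiments show that, for errors in the program data segment, the algorithm reduces erroneous output from 27% to 49% in the original program to 0.01% to 0.02%, with a fault correction rate close to 100%. For errors in the program stack segment, it reduces erroneous output from 10% to 70% in the original program to 1% to 3%, with a fault correction rate above 73%. Compared with other software-implemented soft-fault detection or correction algorithms, the experiments show that the algorithm is simple to implement, computationally light, and has a high error detection and correction capability.

Abstract (English) Computer systems operating in the space environment are subject to various radiation phenomena, whose effects are often called "soft errors". Radiation-hardened chips are generally used to counter these errors, but they are expensive and their performance is typically lower than that of their non-hardened counterparts. This paper puts forward a software-based approach to soft error correction. The technique encodes and decodes program variables in order to detect and correct errors in them. Several benchmark applications were hardened against transient errors by applying the proposed technique. Fault injection campaigns were performed to evaluate its fault detection and correction capability in comparison with state-of-the-art alternative methods. The experimental results show that the proposed approach is far more effective than the other techniques considered in terms of fault correction capability, at the cost of a limited increase in memory requirements and performance overhead.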
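The abstract does not say which encoding the algorithm applies to variables, so the following is only a minimal sketch of the general idea, under a loudly stated assumption: each 32-bit variable is stored as three copies and decoded by a bitwise majority vote. This is a stand-in technique, not the paper's method. The names `encode`, `decode`, `inject_single_bit_flip`, and `run_campaign` are hypothetical, and the toy fault injection loop only mimics the kind of single-bit campaign the abstract describes.

```python
import random

WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1


def encode(value):
    """Store three redundant copies of a 32-bit value.

    Assumed encoding for illustration only; the paper's actual
    encoding is not given in the abstract.
    """
    v = value & MASK
    return [v, v, v]


def decode(copies):
    """Return (corrected_value, error_detected) via bitwise majority vote."""
    a, b, c = copies
    # Each output bit takes the value held by at least two of the copies.
    voted = (a & b) | (a & c) | (b & c)
    detected = not (a == b == c)
    return voted, detected


def inject_single_bit_flip(copies, rng):
    """Flip one randomly chosen bit in one randomly chosen copy, in place."""
    idx = rng.randrange(len(copies))
    bit = rng.randrange(WORD_BITS)
    copies[idx] ^= 1 << bit


def run_campaign(trials, seed=0):
    """Inject one single-bit fault per trial and count corrected trials."""
    rng = random.Random(seed)
    corrected = 0
    for _ in range(trials):
        original = rng.getrandbits(WORD_BITS)
        stored = encode(original)
        inject_single_bit_flip(stored, rng)
        value, detected = decode(stored)
        if detected and value == original:
            corrected += 1
    return corrected
```

Under this assumed scheme every single-bit flip is corrected, because the two untouched copies always outvote the damaged one. The price is a threefold memory footprint, which is one reason a real implementation might prefer a more compact code.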
Source Journal of Astronautics (《宇航学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list, 2007, No. 4, pp. 1044-1048 (5 pages)
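Funding Supported by the Aerospace "Tenth Five-Year Plan" pre-research project "Research on Reliability Evaluation Techniques for Satellite-Borne and Missile-Borne Fault-Tolerant Computer Systems" (417010402)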
Keywords soft error; data error tolerance; concurrent error correction; spaceborne computer; single event upset


