Deadlock Recovery-Based Fault-Tolerant Routing Algorithm for Multi-Dimensional Switching Fabric
Abstract Multi-dimensional switching fabrics are one way to make high-performance routers scalable. As the number of nodes in such a fabric grows, however, so does the probability that the fabric contains a fault. This paper proposes minimal misrouted adaptive routing (MMAR), a fault-tolerant routing algorithm for mesh/torus fabrics based on a deadlock recovery strategy. Using the status of the links around each fault-free node, MMAR tolerates fault models of arbitrary shape while requiring few virtual channels. For concave fault regions, the nodes on the surface of each region store a position table for the nodes inside it, so a message avoids entering a concave region unrelated to its destination, which keeps the detour path as short as possible. Simulation results on a two-dimensional torus with 256 nodes verify the effectiveness of the algorithm.
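The abstract's idea of choosing a next hop from the status of the links around a fault-free node can be illustrated with a minimal sketch. This is not the MMAR algorithm itself: it omits deadlock recovery, virtual channels, and the concave-region position tables, and the 16 x 16 layout, the function names, and the ranking rule are all assumptions made for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

Node = Tuple[int, int]
Link = FrozenSet[Node]

K = 16  # assumed side length: 16 x 16 = 256 nodes in a 2D torus


def neighbors(node: Node) -> List[Node]:
    """Return the four torus neighbors of a node (with wraparound)."""
    x, y = node
    return [((x + 1) % K, y), ((x - 1) % K, y),
            (x, (y + 1) % K), (x, (y - 1) % K)]


def torus_distance(a: Node, b: Node) -> int:
    """Minimal hop count between two nodes, using wraparound links."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, K - dx) + min(dy, K - dy)


def next_hops(cur: Node, dst: Node, faulty_links: Set[Link]) -> List[Node]:
    """Rank the neighbors reachable over non-faulty links.

    Neighbors closer to the destination come first; a neighbor that is
    not closer than `cur` represents a misrouting (detour) step.
    """
    usable = [n for n in neighbors(cur)
              if frozenset((cur, n)) not in faulty_links]
    return sorted(usable, key=lambda n: torus_distance(n, dst))
```

With no faults, the first candidate is always a minimal hop. When the link on the minimal path is faulty, every remaining candidate is a detour, which is where a scheme such as the one described in the abstract would need extra information to keep the misrouted path short.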
Source Journal of University of Electronic Science and Technology of China (《电子科技大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2008, No. 6, pp. 844-847, 854 (5 pages)
Funding National Natural Science Foundation of China (60372011)
Keywords deadlock recovery; fault model; fault-tolerant routing algorithm; multi-dimensional switching fabric

