Abstract: As link rates increase, the arbiter of a centralized switch fabric becomes too complicated to implement. This paper proposes a parallel switch fabric based on crossbar, named PSFBC (Parallel Switch Fabric Based on Crossbar). PSFBC is composed of k switches, each running at 1/k of the link rate, that exchange cells in parallel; this lengthens the arbiter's period and makes it easier to implement. Load is distributed evenly across the switches under an FCFS (First Come First Served) rule, which preserves the order of cells within a stream. A multi-class queue scheduling policy is used in PSFBC to ensure the quality of real-time streams. Experiments show that the load on each switch in PSFBC is well balanced, that the average cell delay is small, and that performance is very close to that of a centralized switch; as the number of parallel switches increases, the performance loss of PSFBC remains very small while the fabric becomes easier to implement.
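The idea of spreading cells over k parallel planes in arrival order can be sketched as follows. This is only an illustration: the abstract states an FCFS distribution rule but gives no mechanism, so the round-robin assignment, the sequence-number reassembly, and the names `dispatch_round_robin` and `merge_in_order` are all assumptions rather than the paper's design.

```python
from collections import deque


def dispatch_round_robin(cells, k):
    """Assign cells, in arrival order, to k parallel planes (assumed round-robin)."""
    planes = [deque() for _ in range(k)]
    for seq, cell in enumerate(cells):
        # Tag each cell with its arrival sequence number so order can be restored.
        planes[seq % k].append((seq, cell))
    return planes


def merge_in_order(planes):
    """Reassemble the original stream by sorting on the arrival sequence number."""
    tagged = [item for plane in planes for item in plane]
    tagged.sort(key=lambda item: item[0])
    return [cell for _, cell in tagged]


cells = [f"cell{i}" for i in range(10)]
planes = dispatch_round_robin(cells, k=4)
loads = [len(plane) for plane in planes]
restored = merge_in_order(planes)
```

With this assignment the per-plane load never differs by more than one cell, and the original cell order is recoverable at the output.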
Abstract: The fast growth of the Internet has created the need for high-speed switches. Recently, the crosspoint-queued switch has attracted attention because of its scalability and high performance. However, the crosspoint-queued switch does not perform well under non-uniform traffic. To overcome this limitation, the Load-Balanced Crosspoint-Queued switch architecture has been proposed. In this architecture, a load-balance stage is placed ahead of the crosspoint-queued stage. The load-balance stage transforms the incoming non-uniform traffic into nearly uniform traffic at the input ports of the second stage. To avoid out-of-order cells, this stage employs flow-based queues in each crosspoint buffer. Analysis and simulation results reveal that under non-uniform traffic, this new switch architecture achieves a delay performance similar to that of the Output-Queued switch without the need for internal acceleration. In addition, its throughput is much better than that of the pure crosspoint-queued switch. Finally, it can achieve the same packet loss rate as the crosspoint-queued switch while using a buffer size that is only 65% of that used by the crosspoint-queued switch.
Abstract: This paper analyses the performance of an ATM switch fabric with Combined Input/Output Buffering (C-IOB) under two service disciplines for the cells at the head-of-line (HOL) positions of the input buffers: (1) First Come First Served (FCFS)/Random Service (RS), in which HOL cells addressed to a given output port are served FCFS when their "ages" (waiting times at the HOL position) differ and at random when their ages are the same; and (2) Pure Random Service (PRS), in which all HOL cells addressed to a given output port are served at random regardless of their ages. In both cases, the Queue Loss (QL) transfer scheme is adopted for the interaction between input and output buffers. The results show that the C-IOB ATM switch fabric with the PRS service policy and the QL transfer scheme performs better than other buffered ATM switch fabrics.
Abstract: To address the bottleneck that switch fabric technology poses for high-speed routers in current networks, this paper describes in detail how to implement a shared-memory switch fabric for packet routers on Xilinx FPGAs. The switch fabric, named SL64, has the following notable features: (1) support for up to 16 OC-48 line cards; (2) lossless data transfer guaranteed through robust flow control; (3) support for up to eight priorities, one of them strict priority; (4) QoS, high throughput, and starvation-free operation guaranteed through the WF2Q+ scheduling algorithm; (5) programmable cell length, with 64-byte, 72-byte, and 80-byte cells available for various applications. In addition, SL64 includes common features such as multicast support, protocol-agnostic cell-based switching, a CRC check for the cell header, and embedded shared memory.
Funding: supported in part by the 863 program (2013AA014402), the National Natural Science Foundation of China (NSFC) (61422508), and the Science and Technology Commission of Shanghai Municipality (STCSM) Project (14QA1402600).
Abstract: We experimentally demonstrate a 4 × 4 nonblocking silicon thermo-optic (TO) switch fabric consisting of three stages of tunable generalized Mach–Zehnder interferometers. All 24 routing states for nonblocking switching are characterized. The device's footprint is 4.6 mm × 1.0 mm. Measurements show that the worst crosstalk over all switching states is -7.2 dB. The on-chip insertion loss is in the range of 3.7–13.1 dB. The average TO switching power consumption is 104.8 mW.
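The figure of 24 routing states matches the number of one-to-one input-to-output assignments of a 4 × 4 nonblocking switch, that is, 4! permutations. A minimal Python sketch of that count, in which the names `PORTS` and `routing_states` are illustrative and not taken from the paper:

```python
from itertools import permutations

PORTS = 4  # a 4 x 4 fabric: four inputs, four outputs

# Each routing state maps input i to output state[i], with no output shared.
routing_states = list(permutations(range(PORTS)))

print(len(routing_states))  # 24
```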
Funding: the National Natural Science Foundation of China (Nos. 60632010 and 60572029) and the National High Technology Research and Development Program (863) of China (No. 2006AA01Z251).
Abstract: A new wavelength reservation scheme is proposed in this study to mitigate the connection setup time and minimize the reconfiguration times of optical cross-connects (OXCs) for WDM optical networks. In this scheme, we consider the reconfiguration information of switch fabrics in the signaling protocol, which is designated as signaling with switch fabric status (SWFS). Distributed reservation algorithms reserve the wavelength that requires the minimum number of OXC reconfigurations along the route. Simulation results indicate that the proposed schemes with switch fabric status have a shorter setup time, a lower switching ratio, and better blocking performance than the traditional schemes. Moreover, the proposed schemes with SWFS significantly reduce the number of switch fabrics that need to be reconfigured.
Funding: supported by the National Natural Science Foundation of China (91338108, 91438206).
Abstract: The harsh space radiation environment compromises the reliability of an on-board switching fabric by causing cross-point and switching element (SE) faults. Unlike traditional fault-tolerant switching fabrics, which take only cross-point faults into account, a novel Input and Output Parallel Clos network, referred to as the (p_1, p_2)-IOPClos, is proposed to tolerate both cross-point and SE faults. In the (p_1, p_2)-IOPClos, there are p_1 and p_2 expanded parallel switching planes in the input and output stages, respectively. The multiple input/output switching planes are interconnected through the middle stage to provide multiple paths in each stage, by which the network throughput can be increased remarkably. Furthermore, the network reliability of the (p_1, p_2)-IOPClos under both kinds of faults is analyzed. The corresponding implementation cost is also presented as a function of the network size. Both theoretical analysis and numerical results indicate that the (p_1, p_2)-IOPClos outperforms traditional Clos-type networks in reliability while having a lower implementation cost than the multi-plane Clos network.
Funding: the National Basic Research Program of China (No. 2007CB307102), the National High Tech Research and Development Program of China (No. 2005AA121210), and the National Natural Science Foundation of China (No. 60572042).
Abstract: Current MSM switching fabrics perform poorly under unbalanced traffic. This paper presents an alternative, the Central-stage Buffered Three-stage Clos switching (CB-3Clos) fabric, and proves that this fabric can emulate an output-queued switch without any speedup. By analyzing the condition for central-stage load balance, this paper also proposes a Central-stage Load-balanced-based Distributed Scheduling algorithm (CLDS) for CB-3Clos. The results show that, compared with the Concurrent Round-Robin based Dispatching (CRRD) algorithm for MSM, the CLDS algorithm achieves high throughput irrespective of the traffic model and better mean packet delay.
Abstract: In this paper, an Independent Window-Access (IWA) scheme is proposed, and the performance of an input-buffered ATM switching fabric with the IWA scheme is analysed by means of a probability generating function approach. Closed-form expressions for the average cell delay and the maximum throughput are given, and the results show that the IWA scheme gives the switching fabric better performance than the traditional window-access scheme. The computer simulation results are in good agreement with the analytical results.
Funding: the High Technology Research and Development Programme of China.
Abstract: In this paper, the functions and the architecture of an ATM exchange are summarized. The switching fabrics, the line interface module, and the control module are discussed in detail. An ATM switching system that adopts the scheme described in this paper has been developed.
Abstract: Most high-end switches use an input-queued or a combined input- and output-queued architecture. The switch fabrics of these architectures commonly use an iterative scheduling system such as iSLIP. Iterative schedulers are not very scalable and can be slow. We propose a new scheduling algorithm that finds a maximum matching of a modified I/O mapping graph in a single iteration (hence noniterative). Analytically and experimentally, we show that it provides full throughput and incurs very low delay; it is fair and of low complexity; and it outperforms traditional iterative schedulers. We also propose two switch architectures suited to this scheduling scheme and analyze their hardware implementations. The arbiter circuit is simple, implementing only a FIFO queue. Only half as many arbiters are needed as for an iterative scheme. The arbiters operate fully in parallel. They work for both architectures and make the hardware implementations simple. The first architecture uses a conventional queuing structure and crossbar. The second uses separate memories for each queue at an input port and a special crossbar. This crossbar is simple and also has a reduced diameter and a distributed structure. We also show that the architectures have good scalability and require almost no speedup.
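For readers unfamiliar with the matching problem a crossbar scheduler solves, the sketch below computes a maximum bipartite matching between inputs and the outputs they hold cells for. It uses the textbook augmenting-path method, not the noniterative algorithm or the modified I/O mapping graph proposed in the paper, whose details the abstract does not give; the function name `maximum_matching` and the sample `requests` are hypothetical.

```python
def maximum_matching(requests, num_outputs):
    """Return {input: output} for a maximum matching.

    requests[i] lists the outputs for which input i has queued cells.
    Textbook augmenting-path search; NOT the paper's scheduler.
    """
    match_of_output = [None] * num_outputs  # input currently granted each output

    def try_assign(inp, visited):
        for out in requests[inp]:
            if out in visited:
                continue
            visited.add(out)
            holder = match_of_output[out]
            # Take a free output, or re-route its current holder elsewhere.
            if holder is None or try_assign(holder, visited):
                match_of_output[out] = inp
                return True
        return False

    for inp in range(len(requests)):
        try_assign(inp, set())

    return {inp: out for out, inp in enumerate(match_of_output) if inp is not None}


# Hypothetical 4-input, 4-output request pattern.
requests = [[0, 1], [0], [1, 2], [2]]
matching = maximum_matching(requests, num_outputs=4)
```

In this sample only three distinct outputs are requested, so at most three inputs can be served in one cell time, and the search finds such a set.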
Funding: the National Natural Science Foundation of China under Grant No. 60572042; the National High-Tech Research and Development Plan of China (863) under Grant No. 2005AA121210; the National Basic Research Program of China (973) under Grant No. 2007CB307102.
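Abstract: The scheduling policy is an important guarantee of the performance of core routing and switching equipment. To address the shortcomings in complexity or performance of existing scheduling policies for the combined input and cross-point queuing (CICQ) switch fabric, this paper examines in depth the basic design criteria for CICQ scheduling policies and introduces the concept of a virtual channel in CICQ. Based on these criteria and the virtual channel concept, it proposes a simple, efficient, and fair dynamic round-robin scheduling policy, FDR (fair service and dynamic round robin). The algorithm has O(1) complexity and scales well. It allocates scheduling shares to virtual channels according to their state, which gives it good dynamic real-time performance and lets it adapt to network environments with non-uniform traffic load. Simulation results from SPES (switching performance evaluation system) show that the algorithm has good delay, throughput, and burst-resistance performance.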