Funding: Supported by the NSFC (National Natural Science Foundation of China), the 863 Program (2006AA1332), ERIPKU, and the Program for New Century Excellent Talents in University.
Abstract: The speed of communication between nodes in a parallel processor has become the major bottleneck of the processor's performance. RDMA (Remote Direct Memory Access) technology has recently drawn more attention because it can transfer larger amounts of data at higher speed and with greater reliability. A 4DSP (4 Digital Signal Processing) module composed of Tiger-SHARC201 chips is connected by LVDS (Low Voltage Differential Signal) circuits. This paper proposes a general, reconfigurable RDMA platform and a corresponding communication protocol in which all routes are linked on the basis of zero copy. The protocol transfers DSP messages by means of DMA interrupts and is applied to massive remote image impression, which reduces memory requirements and the CPU workload. The experimental results show that the platform is efficient and flexible, and that it can be expanded and integrated at a larger scale in later development stages.