Funding: Supported by the Key National Natural Science Foundation of China under Grant No. 61136002, the National Natural Science Foundation of China under Grant No. 61272120, and the Natural Science Basic Research Plan in Shaanxi Province of China under Grant No. 2013JC2-32.
Abstract: To maximize parallelism, distribute rendering tasks effectively, and balance performance against flexibility in the graphics processing pipeline, this article presents the design, performance analysis, and optimization of a multi-core interactive graphics processing unit (MIGPU). The processor integrates twelve processing cores with a specific instruction set architecture and many sophisticated application-specific accelerators into a 3D graphics engine, and it is implemented on an XC6VLX550T field programmable gate array (FPGA). MIGPU supports OpenGL 2.0 with a programmable front-end processor, vertex shader, plane clipper, geometry transformer, 3D clippers, and pixel shaders. To boost MIGPU performance, a model is established of the relationship among primitive types, vertices, pixels, and the effects of culling, clipping, and memory access; the model shows a way to improve the speedup of the graphics pipeline. MIGPU can assign graphics rendering tasks to different processors for efficiency and flexibility. The pixel fill rate reaches 40 Mpixel/s at peak performance.
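To put the 40 Mpixel/s peak fill rate in perspective, the short calculation below converts a fill rate into an upper bound on frames per second. It is only a back-of-the-envelope sketch: the 640x480 resolution, the overdraw factor, and the function name `peak_frames_per_second` are illustrative assumptions and do not come from the article.

```python
def peak_frames_per_second(fill_rate_pixels_per_s, width, height, overdraw=1.0):
    """Upper bound on frame rate when pixel filling is the only bottleneck.

    overdraw is the average number of times each screen pixel is written
    per frame (1.0 means every pixel is written exactly once).
    """
    if width <= 0 or height <= 0 or overdraw <= 0:
        raise ValueError("width, height and overdraw must be positive")
    pixels_per_frame = width * height * overdraw
    return fill_rate_pixels_per_s / pixels_per_frame


# Assumed example: a 640x480 frame with no overdraw at the reported peak rate.
fps_bound = peak_frames_per_second(40e6, 640, 480)
print(f"{fps_bound:.1f} frames/s upper bound")
```

Real frame rates would be lower, since geometry processing, clipping, and memory access also take time; the relationship model mentioned in the abstract concerns exactly those effects.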
Funding: Supported by the National Natural Science Foundation of China (60976020).
Abstract: To further improve network performance, a routing algorithm for the 2D-Torus is investigated from the standpoint of load balance across virtual channels. The 2D-Torus network is divided into two virtual networks, and each physical channel is split into three virtual channels. A novel virtual channel allocation policy and a routing algorithm are proposed in which a random parameter distributes the traffic load across those three virtual channels in a more balanced manner. The proposed algorithm is simulated with a SystemC-based test bench. The results show that, compared with the negative-first for Torus networks (NF-T) algorithm, the proposed algorithm achieves better network latency and throughput under different traffic patterns. They also show that balancing load across virtual channels can significantly improve network performance.
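The abstract does not spell out the allocation policy, so the following is only a minimal sketch of two ingredients it mentions: routing on a torus with wraparound links, and using randomness to spread traffic over three virtual channels. The helper names `torus_offset` and `pick_virtual_channel` are hypothetical, the uniform random choice is an assumption, and the sketch does not address the deadlock-avoidance constraints a real policy must respect.

```python
import random


def torus_offset(src, dst, size):
    """Shortest signed hop count from src to dst along one ring of a torus.

    A positive result means travel in the increasing direction; a negative
    result means use the wraparound in the decreasing direction.
    Ties (exactly half the ring) resolve to the positive direction.
    """
    delta = (dst - src) % size
    if delta > size // 2:
        delta -= size
    return delta


def pick_virtual_channel(rng, candidates=(0, 1, 2)):
    """Pick one of the candidate virtual channels uniformly at random.

    This stands in for the paper's random parameter; the real policy is
    not described in the abstract.
    """
    return rng.choice(candidates)


rng = random.Random(0)  # fixed seed so the run is repeatable
dx = torus_offset(1, 7, 8)   # wraps around: 2 hops in the decreasing direction
vc = pick_virtual_channel(rng)
print(dx, vc)
```

Over many packets, a uniform choice sends roughly one third of the traffic to each virtual channel, which is the load-spreading effect the abstract attributes to the random parameter.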
Funding: Supported by the National Natural Science Foundation of China (60976020).
Abstract: To improve the scalability and reduce the implementation complexity of Mesh and Mesh-like networks, the semi-diagonal Torus (SD-Torus) network, a regular and symmetrical interconnection network, is proposed. The SD-Torus network combines a typical 2D-Torus network with two extra diagonal links per node in the northwest-to-southeast direction. The topological properties of SD-Torus networks are discussed, and a load-balanced routing algorithm for SD-Torus is presented. SystemC-based simulation results show that, compared with diagonal Mesh (DMesh), diagonal Torus (DTorus), and XMesh networks, the SD-Torus network achieves high performance at a lower network cost. This makes the SD-Torus network a strong candidate for high-performance interconnection networks.
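The topology described above can be sketched as a neighbor function: each node keeps the four 2D-Torus links and gains one northwest and one southeast diagonal link, for six links in total. The coordinate convention (x grows eastward, y grows southward), the wraparound of the diagonal links, and the name `sd_torus_neighbors` are assumptions made for illustration; the abstract does not fix them.

```python
def sd_torus_neighbors(x, y, width, height):
    """Return the six neighbors of node (x, y) in an assumed SD-Torus layout.

    All links wrap around at the edges, as in a 2D-Torus.
    """
    offsets = [
        (1, 0), (-1, 0),   # east, west
        (0, 1), (0, -1),   # south, north (y grows southward: an assumption)
        (-1, -1), (1, 1),  # northwest and southeast diagonal links
    ]
    return [((x + dx) % width, (y + dy) % height) for dx, dy in offsets]


# Example: the corner node of a 4x4 network reaches its neighbors via wraparound.
print(sd_torus_neighbors(0, 0, 4, 4))
```

Because every offset has its opposite in the list, the links are bidirectional and every node has the same degree, which matches the regular and symmetrical structure the abstract claims.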