Funding: Supported by the Basic Research Foundation of Tsinghua National Laboratory for Information Science and Technology (TNList) and the National High-Tech Research and Development (863) Program of China (No. 2007AA01Z468)
Abstract: With the explosion of network bandwidth and the ever-changing requirements of diverse network-based applications, the traditional processing architectures, i.e., general purpose processors (GPPs) and application specific integrated circuits (ASICs), cannot provide sufficient flexibility and high performance at the same time. Thus, the network processor (NP) has emerged as an alternative to meet these dual demands for today's network processing. The NP combines embedded multi-threaded cores with a rich memory hierarchy that can adapt to different networking circumstances when customized by application developers. In today's NP architectures, multithreading prevails over the cache mechanism, which has been highly successful at hiding memory access latencies in GPPs. This paper focuses on the efficiency of the cache mechanism in an NP. Theoretical timing models of packet processing are established for evaluating cache efficiency, and experiments are performed based on real-life network backbone traces. Test results show that the cache mechanism can improve throughput by nearly 70%. Accordingly, the cache mechanism is still efficient and irreplaceable in network processing, despite the existence of multithreading.
Abstract: This paper describes a solution for building a network-processor-based Radio Network Controller (RNC) in all-IP wireless networks. It covers the structure of 3rd Generation (3G) wireless networks and the role of network nodes such as the Base Station (BS), the RNC, and Packet-Switched Core Networks (PSCN). The architecture of the IXP2800 network processor and the detailed implementation of the solution on an IXP2800-based RNC are also covered. This solution can provide scalable IP forwarding features and it will be widely used in 3G RNCs.
Funding: Supported by the Basic Research Foundation of Tsinghua National Laboratory for Information Science and Technology (TNList) and the National High-Tech Research and Development (863) Program of China (No. 2007AA01Z468)
Abstract: Today's firewalls and security gateways are required not only to block unauthorized accesses by authenticating packet headers, but also to inspect flow payloads for malicious intrusions. Deep inspection emerges as a seamless integration of packet classification for access control and pattern matching for intrusion prevention. The two function blocks are linked together via well-designed session lookup schemes. This paper presents an architecture-aware session lookup scheme for deep inspection on network processors (NPs). Test results show that the proposed session data structure and integration approach can achieve the OC-48 line rate (2.5 Gbps) with inline stateful content inspection on the Intel IXP2850 NP. This work provides insight into application design and implementation on NPs and into principles for performance tuning of NP-based programs, such as data allocation, task partitioning, latency hiding, and thread synchronization.
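The way a session table links packet classification to pattern matching can be illustrated with a minimal sketch. It assumes a plain hash table keyed on the 5-tuple; the names `FlowKey`, `Session`, `SessionTable`, `classify`, and `process_packet` are hypothetical, and this is not the architecture-aware data structure the paper proposes.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple, Optional


class FlowKey(NamedTuple):
    """5-tuple identifying a flow (hypothetical field names)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int


@dataclass
class Session:
    """Per-flow state shared by the two function blocks."""
    action: str           # verdict cached from packet classification
    match_state: int = 0  # placeholder for pattern-matching state kept across packets
    packets: int = 0


class SessionTable:
    """Assumed design: a dictionary standing in for the session lookup scheme."""

    def __init__(self) -> None:
        self._table: Dict[FlowKey, Session] = {}

    def lookup(self, key: FlowKey) -> Optional[Session]:
        return self._table.get(key)

    def insert(self, key: FlowKey, action: str) -> Session:
        session = Session(action=action)
        self._table[key] = session
        return session


def classify(key: FlowKey) -> str:
    """Stand-in classifier (illustrative rule): inspect TCP port 80, drop the rest."""
    return "inspect" if key.proto == 6 and key.dst_port == 80 else "drop"


def process_packet(table: SessionTable, key: FlowKey) -> str:
    """Classify only the first packet of a flow; later packets reuse the session."""
    session = table.lookup(key)
    if session is None:
        session = table.insert(key, classify(key))
    session.packets += 1
    return session.action
```

In this sketch only the first packet of a flow pays for classification; every later packet is resolved by one table lookup, which is why the lookup scheme matters for line-rate operation.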
Funding: Supported by the National High-Tech Research and Development (863) Program of China (No. 863-300-01-99) and the National Natural Science Foundation of China (No. 60173009)
Abstract: High-performance network processors are expected to play an important role in future high-speed routers. This paper focuses on two representative techniques needed for high-performance network processors: hardwired logic design and multithread design. Using hardwired logic, this paper compares a single-thread design with a multithread design, and proposes general models and principles for analyzing the clock frequency and the resource cost of these designs. Two IP header processing schemes, one in single-thread mode and the other in double-thread mode, are then developed using these principles, and the implementation results verify the theoretical calculations.
Abstract: We developed a high-speed information retrieval system. The system, based on the IXP2800, is a dedicated device. The retrieval throughput is 6.8 Gb/s. It supports various network protocols, including Telnet, FTP, SMTP, and POP3. Retrieval supports keyword search and natural language processing. This paper describes the hardware system, the software system, and the performance metrics. Key words: network processor; IXP2800; information retrieval; IXA. CLC number: TP 309. Foundation item: Supported by the National Natural Science Foundation of China (69873016 & 69972017) and the National High Technology Development Program of China (863-301-06-1). Biography: SHI Shu-dong (1963-), male, Ph.D. candidate, research direction: network & information security.
Abstract: Recent efforts to add new services to the wideband code division multiple access (WCDMA) system have increased interest in network processor (NP)-based routers that are easy to extend and evolve. This paper proposes an application of NPs in the routing engine module (REM) of the radio network controller (RNC) in a WCDMA system. The measurement results show that NPs perform well and efficiently in routing the traffic of the communication network, and the simulation verifies the fast forwarding function of NPs.
Funding: Project (No. 2011ZX01034-002-002-003) supported by the National Science and Technology Major Projects of the Ministry of Industry and Information Technology, China
Abstract: This paper deals with an in-line network security processor (NSP) design that implements Internet Protocol Security (IPSec) protocol processing for 10 Gbps Ethernet. The 10 Gbps high-speed data transfer and the IPSec processing, including the crypto-operation, the database query, and the IPSec header processing, are integrated in the design. The in-line NSP is implemented using 65 nm CMOS technology and the layout area is 2.5 mm×3 mm with 360 million gates. A configurable crossbar data transfer skeleton implementing an iSLIP scheduling algorithm is proposed, which enables simultaneous data transfer between the heterogeneous multiple cores. In addition, a high-speed input/output data buffering mechanism and high-performance hardware structures for the modules are designed, so that the transfer efficiency and the resource utilization are maximized and the IPSec protocol processing achieves 10 Gbps line speed. A high-speed, low-power hardware look-up method is proposed, which effectively reduces the area and power dissipation. The post-simulation results demonstrate that the design gives a peak throughput of 10.06 Gbps for the Authentication Header (AH) transport mode with an average test packet length of 512 bytes at a clock rate of 250 MHz, with power dissipation below 1 W. An FPGA prototype is constructed to verify the function of the design. A test bench is being set up for performance and function verification.
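The iSLIP algorithm named above matches crossbar inputs to outputs with rotating round-robin pointers. The sketch below shows a single request-grant-accept iteration of standard iSLIP in software, only to illustrate the idea; the function name `islip_iteration` and its data layout are hypothetical, and it says nothing about how the paper's configurable crossbar implements the scheduler in hardware.

```python
from typing import Dict, List, Set


def islip_iteration(
    requests: List[Set[int]],
    grant_ptr: List[int],
    accept_ptr: List[int],
) -> Dict[int, int]:
    """Run one request-grant-accept round for an N x N crossbar.

    requests[i] is the set of outputs for which input i has queued cells.
    grant_ptr[o] and accept_ptr[i] are round-robin pointers, updated in place.
    Returns the matching as {input: output}.
    """
    n = len(requests)

    # Grant phase: each output grants the first requesting input
    # found at or after its grant pointer.
    grants: Dict[int, List[int]] = {i: [] for i in range(n)}
    for out in range(n):
        for step in range(n):
            inp = (grant_ptr[out] + step) % n
            if out in requests[inp]:
                grants[inp].append(out)
                break

    # Accept phase: each input accepts the first granting output
    # found at or after its accept pointer.
    matches: Dict[int, int] = {}
    for inp in range(n):
        if not grants[inp]:
            continue
        for step in range(n):
            out = (accept_ptr[inp] + step) % n
            if out in grants[inp]:
                matches[inp] = out
                # Pointers advance one past the match, and only for accepted grants.
                accept_ptr[inp] = (out + 1) % n
                grant_ptr[out] = (inp + 1) % n
                break
    return matches
```

With `requests = [{0, 1}, {0}]` and all pointers at zero, the first call matches only input 0 to output 0, while a second call with the same requests matches both inputs. Advancing a pointer only when a grant is accepted is what desynchronizes the outputs and lets transfers proceed simultaneously.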
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 90715029 and 60603053), the Cultivation Fund of the Key Scientific and Technical Innovation Project, Ministry of Education of China, and the Key Project of Science & Technology of Hunan Province (Grant No. 2006GK2006)
Abstract: Task scheduling is an essential aspect of parallel processing systems. Classic formulations of this NP-hard problem assume fully connected homogeneous processors and ignore contention on the communication links. However, in an arbitrary processor network (APN), communication contention has a strong influence on the execution time of a parallel application. This paper investigates the incorporation of contention awareness into task scheduling. The innovation is the idea of dynamically scheduling edges to links, for which we use an earliest-finish-communication-time search algorithm based on the shortest-path search method. The other novel idea proposed in this paper is a scheduling priority based on recursive rank computation on a heterogeneous arbitrary processor network. Finally, to reduce the time complexity of the algorithm, a parallel algorithm is proposed and a speedup of O(PPE) is achieved. The comparison study, based on both randomly generated graphs and the graphs of some real applications, shows that our scheduling algorithm significantly surpasses classic and static communication contention awareness algorithms, especially for parallel applications with high data transmission rates.
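The abstract does not define its rank, so the sketch below uses the upward rank common in list scheduling (as in HEFT) purely as an assumed stand-in: a task's rank is its own cost plus the largest communication-plus-rank over its successors. The function `upward_ranks` and the example graph are hypothetical.

```python
from functools import lru_cache
from typing import Dict, List, Tuple


def upward_ranks(
    cost: Dict[str, float],
    comm: Dict[Tuple[str, str], float],
) -> Dict[str, float]:
    """Compute a recursive upward rank for every task in a DAG.

    cost[t] is the (average) computation cost of task t.
    comm[(u, v)] is the (average) communication cost of edge u -> v.
    """
    successors: Dict[str, List[str]] = {t: [] for t in cost}
    for (u, v) in comm:
        successors[u].append(v)

    @lru_cache(maxsize=None)
    def rank(task: str) -> float:
        # Exit tasks have no successors, so their rank is just their own cost.
        tail = max(
            (comm[(task, s)] + rank(s) for s in successors[task]),
            default=0.0,
        )
        return cost[task] + tail

    return {t: rank(t) for t in cost}


# Hypothetical four-task diamond graph: a -> {b, c} -> d.
example_cost = {"a": 2, "b": 3, "c": 1, "d": 2}
example_comm = {("a", "b"): 1, ("a", "c"): 4, ("b", "d"): 2, ("c", "d"): 1}
example_ranks = upward_ranks(example_cost, example_comm)
priority_order = sorted(example_ranks, key=example_ranks.get, reverse=True)
```

Sorting tasks by decreasing rank gives a priority list that respects precedence. A contention-aware scheduler would then also reserve time slots on the links along each message's route, which this sketch does not model.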