Packet classification (PC) has become the main method to support the quality of service and security of network application. And two-dimeusioual prefix packet classification (PPC) is the popular one. This paper analyz...Packet classification (PC) has become the main method to support the quality of service and security of network application. And two-dimeusioual prefix packet classification (PPC) is the popular one. This paper analyzes the problem of ruler conflict, and then presents a TCAM-based two-dimensional PPC algorithm. This algorithm makes use of the parallelism of TCAM to lookup the longest prefix in one instruction cycle. Then it uses a memory image and associated data structures to eliminate the conflicts between rulers, and performs a fast two-dimeusional PPC. Compared with other algorithms, this algorithm has the least time complexity and less space complexity.展开更多
The growing trend of network virtualization results in a widespread adoption of virtual switches in virtualized environments. However, virtual switching is confronted with great performance challenges regarding packet...The growing trend of network virtualization results in a widespread adoption of virtual switches in virtualized environments. However, virtual switching is confronted with great performance challenges regarding packet classification especially in Open Flow-based software defined networks. This paper first takes an insight into packet classification in virtual Open Flow switching, and points out that its performance bottleneck is dominated by flow table traversals of multiple failed mask probing for each arrived packet. Then we are motivated to propose an efficient packet classification algorithm based on counting bloom filters. In particular, counting bloom filters are applied to predict the failures of flow table lookups with great possibilities, and bypass flow table traversals for failed mask probing. Finally, our proposed packet classification algorithm is evaluated with real network traffic traces by experiments. The experimental results indicate that our proposed algorithm outperforms the classical one in Open v Switch in terms of average search length, and contributes to promote virtual Open Flow switching performance.展开更多
Traditional packet classification for IPv4 involves examining standard 5-tuple of a packet header, source address, destination address, source port, destination port and protocol. With introduction of IPv6 flow label ...Traditional packet classification for IPv4 involves examining standard 5-tuple of a packet header, source address, destination address, source port, destination port and protocol. With introduction of IPv6 flow label field which entails labeling the packets belonging to the same flow, packet classification can be resolved based on 3 dimensions: flow label, source address and desti- nation address. In this paper, we propose a novel approach for the 3-tuple packet classification based on flow label. Besides, by introducing a conversion engine to covert the source-destination pairs to the compound address prefixes, we put forward an algorithm called Reducing Dimension (RD) with dimension reduction capability, which combines heuristic tree search with usage of buck- ets. And we also provide an improved version of RD, called Improved RD (IRD), which uses two mechanisms: path compression and priority tag, to optimize the perforrmnce. To evaluate our algo- rithm, extensive experiraents have been conducted using a number of synthetically generated databas- es. For the memory consumption, the two pro- posed new algorithms only consumes around 3% of the existing algorithms when the number of ill- ters increases to 10 k. And for the average search time, the search time of the two proposed algo- rithms is more than four times faster than others when the number of filters is 10 k. The results show that the proposed algorithm works well and outperforms rmny typical existing algorithms with the dimension reduction capability.展开更多
The feature of Ternary Content Addressable Memories(TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However,the limitations of TCAMs impede th...The feature of Ternary Content Addressable Memories(TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However,the limitations of TCAMs impede their utilization. In this paper,the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.展开更多
Packet classification has been studied for decades; it classifies packets into specific flows based on a given rule set. As software-defined network was proposed, a recent trend of packet classification is to scale th...Packet classification has been studied for decades; it classifies packets into specific flows based on a given rule set. As software-defined network was proposed, a recent trend of packet classification is to scale the five-tuple model to multi-tuple. In general, packet classification on multiple fields is a complex problem. Although most existing software-based algorithms have been proved extraordinary in practice, they are only suitable for the classic five-tuple model and difficult to be scaled up. Meanwhile, hardware-specific solutions are inflexible and expensive, and some of them are power consuming. In this paper, we propose a universal multi-dimensional packet classification approach for multi-core systems. In our approach, novel data structures and four decomposition-based algorithms are designed to optimize the classification and updating of rules. For multi-field rules, a rule set is cut into several parts according to the number of fields. Each part works independently. In this way, the fields are searched in parallel and all the partial results are merged together at last. To demonstrate the feasibility of our approach, we implement a prototype and evaluate its throughput and latency. Experimental results show that our approach achieves a 40% higher throughput than that of other decomposed-based algorithms and a 43% lower latency of rule incremental update than that of the other algorithms on average. Furthermore, our approach saves 39% memory consumption on average and has a good scalability.展开更多
Packet classification is crucial to the implementation of advanced network services that require the capability to distinguish traffic in different flows, such as access control in firewalls and protocol analysis in i...Packet classification is crucial to the implementation of advanced network services that require the capability to distinguish traffic in different flows, such as access control in firewalls and protocol analysis in intrusion detection systems. This paper proposes a novel packet classification algorithm optimized for multi-core network processors. The proposed algorithm, AggreCuts, has an explicit worst-case search time with modest memory usage. The data structure of AggreCuts is flexible and well-adapted to different types of multi-core platforms. The algorithm on both Intel IXP2850 32-bit and Cavium OCTEON3860 64-bit multi-core platforms was implemented to evaluate the performance of AggreCuts. The experimental results show that AggreCuts outperforms the best-known existing algorithm in terms of memory usage and classification speed.展开更多
The packet classification is a fundamental process in provisioning security and quality of service for many intelligent network-embedded systems running in the Internet of Things(IoT).In recent years,researchers have ...The packet classification is a fundamental process in provisioning security and quality of service for many intelligent network-embedded systems running in the Internet of Things(IoT).In recent years,researchers have tried to develop hardware-based solutions for the classification of Internet packets.Due to higher throughput and shorter delays,these solutions are considered as a major key to improving the quality of services.Most of these efforts have attempted to implement a software algorithm on the FPGA to reduce the processing time and enhance the throughput.The proposed architectures,however,cannot reach a compromise among power consumption,memory usage,and throughput rate.In view of this,the architecture proposed in this paper contains a pipelinebased micro-core that is used in network processors to classify packets.To this end,three architectures have been implemented using the proposed micro-core.The first architecture performs parallel classification based on header fields.The second one classifies packets in a serial manner.The last architecture is the pipeline-based classifier,which can increase performance by nine times.The proposed architectures have been implemented on an FPGA chip.The results are indicative of a reduction in memory usage as well as an increase in speedup and throughput.The architecture has a power consumption of is 1.294w,and its throughput with a frequency of 233 MHz exceeds 147 Gbps.展开更多
Modem network security devices employ packet classification and pattern matching algorithms to inspect packets. Due to the complexity and heterogeneity of different search data structures, it is difficult for existing...Modem network security devices employ packet classification and pattern matching algorithms to inspect packets. Due to the complexity and heterogeneity of different search data structures, it is difficult for existing algorithms to leverage modern hardware platforms to achieve high performance. This paper presents a Structural Compression (SC) method that optimizes the data structures of both algorithms. It reviews both algorithms under the model of search space decomposition, and homogenizes their search data structures. This approach not only guarantees deterministic lookup speed but also optimizes the data structure for efficient implementation oi1 many-core platforms. The performance evaluation reveals that the homogeneous data structure achieves 10Gbps line-rate 64byte packet classification throughput and multi-Gbps deep inspection speed.展开更多
Packet classification on multi-fields is a fundamental mechanism in network equipments,and various classification solutions have been proposed.Because of inherent difficulties,many of these solutions scale poorly in e...Packet classification on multi-fields is a fundamental mechanism in network equipments,and various classification solutions have been proposed.Because of inherent difficulties,many of these solutions scale poorly in either time or space as rule sets grow in size.Recursive Flow Classification(RFC) is an algorithm with a very high classifying speed. However,its preprocessing complexity and memory requirement are rather high.In this paper,we propose an enhanced RFC(ERFC) algorithm,in which a hash-based aggregated bit vector scheme is exploited to speed up its preprocessing procedure.A compressed and cacheable data structure is also introduced to decrease total memory requirement and improve its searching performance.Evaluation results show that ERFC provides a great improvement over RFC in both space requirement and preprocessing time.The search time complexity of ERFC is equivalent to that of RFC in the worst case; and its average classifying speed is improved by about 100%.展开更多
The network switches in the data plane of Software Defined Networking (SDN) are empowered by an elementary process, in which enormous number of packets which resemble big volumes of data are classified into specific f...The network switches in the data plane of Software Defined Networking (SDN) are empowered by an elementary process, in which enormous number of packets which resemble big volumes of data are classified into specific flows by matching them against a set of dynamic rules. This basic process accelerates the processing of data, so that instead of processing singular packets repeatedly, corresponding actions are performed on corresponding flows of packets. In this paper, first, we address limitations on a typical packet classification algorithm like Tuple Space Search (TSS). Then, we present a set of different scenarios to parallelize it on different parallel processing platforms, including Graphics Processing Units (GPUs), clusters of Central Processing Units (CPUs), and hybrid clusters. Experimental results show that the hybrid cluster provides the best platform for parallelizing packet classification algorithms, which promises the average throughput rate of 4.2 Million packets per second (Mpps). That is, the hybrid cluster produced by the integration of Compute Unified Device Architecture (CUDA), Message Passing Interface (MPI), and OpenMP programming model could classify 0.24 million packets per second more than the GPU cluster scheme. Such a packet classifier satisfies the required processing speed in the programmable network systems that would be used to communicate big medical data.展开更多
基金Foundation item: supported by Intel Corporation (No. 9078)
文摘Packet classification (PC) has become the main method to support the quality of service and security of network application. And two-dimeusioual prefix packet classification (PPC) is the popular one. This paper analyzes the problem of ruler conflict, and then presents a TCAM-based two-dimensional PPC algorithm. This algorithm makes use of the parallelism of TCAM to lookup the longest prefix in one instruction cycle. Then it uses a memory image and associated data structures to eliminate the conflicts between rulers, and performs a fast two-dimeusional PPC. Compared with other algorithms, this algorithm has the least time complexity and less space complexity.
基金supported in part by National Natural Science Foundation of China(61272148,61572525,61502056,and 61602525)Hunan Provincial Natural Science Foundation of China(2015JJ3010)Scientific Research Fund of Hunan Provincial Education Department(15B009,14C0285)
文摘The growing trend of network virtualization results in a widespread adoption of virtual switches in virtualized environments. However, virtual switching is confronted with great performance challenges regarding packet classification especially in Open Flow-based software defined networks. This paper first takes an insight into packet classification in virtual Open Flow switching, and points out that its performance bottleneck is dominated by flow table traversals of multiple failed mask probing for each arrived packet. Then we are motivated to propose an efficient packet classification algorithm based on counting bloom filters. In particular, counting bloom filters are applied to predict the failures of flow table lookups with great possibilities, and bypass flow table traversals for failed mask probing. Finally, our proposed packet classification algorithm is evaluated with real network traffic traces by experiments. The experimental results indicate that our proposed algorithm outperforms the classical one in Open v Switch in terms of average search length, and contributes to promote virtual Open Flow switching performance.
基金This paper was supported by the National Natural Science Foundation of China under Crant No. 61003282 the Funda- mental Research Funds for the Central Universities under Crant No. 2011RCI)508+1 种基金 National Basic Research Program of China under Crant No. 2009CB320505 National High Technol-ogy Research and Development Program of China under Oant No. 2011AA010704.
文摘Traditional packet classification for IPv4 involves examining standard 5-tuple of a packet header, source address, destination address, source port, destination port and protocol. With introduction of IPv6 flow label field which entails labeling the packets belonging to the same flow, packet classification can be resolved based on 3 dimensions: flow label, source address and desti- nation address. In this paper, we propose a novel approach for the 3-tuple packet classification based on flow label. Besides, by introducing a conversion engine to covert the source-destination pairs to the compound address prefixes, we put forward an algorithm called Reducing Dimension (RD) with dimension reduction capability, which combines heuristic tree search with usage of buck- ets. And we also provide an improved version of RD, called Improved RD (IRD), which uses two mechanisms: path compression and priority tag, to optimize the perforrmnce. To evaluate our algo- rithm, extensive experiraents have been conducted using a number of synthetically generated databas- es. For the memory consumption, the two pro- posed new algorithms only consumes around 3% of the existing algorithms when the number of ill- ters increases to 10 k. And for the average search time, the search time of the two proposed algo- rithms is more than four times faster than others when the number of filters is 10 k. The results show that the proposed algorithm works well and outperforms rmny typical existing algorithms with the dimension reduction capability.
基金the National Natural Science Foundation of China (No.60532030).
文摘The feature of Ternary Content Addressable Memories(TCAMs) makes them particularly attractive for IP address lookup and packet classification applications in a router system. However,the limitations of TCAMs impede their utilization. In this paper,the solutions for decreasing the power consumption and avoiding entry expansion in range matching are addressed. Experimental results demonstrate that the proposed techniques can make some big improvements on the performance of TCAMs in IP address lookup and packet classification.
基金This work was supported by the National Basic Research 973 Program of China under Grant No. 2012CB315805 and the National Natural Science Foundation of China under Grant Nos. 61472130 and 61702174.
文摘Packet classification has been studied for decades; it classifies packets into specific flows based on a given rule set. As software-defined network was proposed, a recent trend of packet classification is to scale the five-tuple model to multi-tuple. In general, packet classification on multiple fields is a complex problem. Although most existing software-based algorithms have been proved extraordinary in practice, they are only suitable for the classic five-tuple model and difficult to be scaled up. Meanwhile, hardware-specific solutions are inflexible and expensive, and some of them are power consuming. In this paper, we propose a universal multi-dimensional packet classification approach for multi-core systems. In our approach, novel data structures and four decomposition-based algorithms are designed to optimize the classification and updating of rules. For multi-field rules, a rule set is cut into several parts according to the number of fields. Each part works independently. In this way, the fields are searched in parallel and all the partial results are merged together at last. To demonstrate the feasibility of our approach, we implement a prototype and evaluate its throughput and latency. Experimental results show that our approach achieves a 40% higher throughput than that of other decomposed-based algorithms and a 43% lower latency of rule incremental update than that of the other algorithms on average. Furthermore, our approach saves 39% memory consumption on average and has a good scalability.
基金Supported by the National High-Tech Research and Development (863) Program of China (No. 2007AA01Z468)
文摘Packet classification is crucial to the implementation of advanced network services that require the capability to distinguish traffic in different flows, such as access control in firewalls and protocol analysis in intrusion detection systems. This paper proposes a novel packet classification algorithm optimized for multi-core network processors. The proposed algorithm, AggreCuts, has an explicit worst-case search time with modest memory usage. The data structure of AggreCuts is flexible and well-adapted to different types of multi-core platforms. The algorithm on both Intel IXP2850 32-bit and Cavium OCTEON3860 64-bit multi-core platforms was implemented to evaluate the performance of AggreCuts. The experimental results show that AggreCuts outperforms the best-known existing algorithm in terms of memory usage and classification speed.
文摘The packet classification is a fundamental process in provisioning security and quality of service for many intelligent network-embedded systems running in the Internet of Things(IoT).In recent years,researchers have tried to develop hardware-based solutions for the classification of Internet packets.Due to higher throughput and shorter delays,these solutions are considered as a major key to improving the quality of services.Most of these efforts have attempted to implement a software algorithm on the FPGA to reduce the processing time and enhance the throughput.The proposed architectures,however,cannot reach a compromise among power consumption,memory usage,and throughput rate.In view of this,the architecture proposed in this paper contains a pipelinebased micro-core that is used in network processors to classify packets.To this end,three architectures have been implemented using the proposed micro-core.The first architecture performs parallel classification based on header fields.The second one classifies packets in a serial manner.The last architecture is the pipeline-based classifier,which can increase performance by nine times.The proposed architectures have been implemented on an FPGA chip.The results are indicative of a reduction in memory usage as well as an increase in speedup and throughput.The architecture has a power consumption of is 1.294w,and its throughput with a frequency of 233 MHz exceeds 147 Gbps.
文摘Modem network security devices employ packet classification and pattern matching algorithms to inspect packets. Due to the complexity and heterogeneity of different search data structures, it is difficult for existing algorithms to leverage modern hardware platforms to achieve high performance. This paper presents a Structural Compression (SC) method that optimizes the data structures of both algorithms. It reviews both algorithms under the model of search space decomposition, and homogenizes their search data structures. This approach not only guarantees deterministic lookup speed but also optimizes the data structure for efficient implementation oi1 many-core platforms. The performance evaluation reveals that the homogeneous data structure achieves 10Gbps line-rate 64byte packet classification throughput and multi-Gbps deep inspection speed.
基金Supported by the National Basic Research 973 Program of China under Grant No.2009CB320504the National Hi-Tech Research and Development 863 Program of China under Grant Nos.2008AA01A324 and 2009AA01Z210.
文摘Packet classification on multi-fields is a fundamental mechanism in network equipments,and various classification solutions have been proposed.Because of inherent difficulties,many of these solutions scale poorly in either time or space as rule sets grow in size.Recursive Flow Classification(RFC) is an algorithm with a very high classifying speed. However,its preprocessing complexity and memory requirement are rather high.In this paper,we propose an enhanced RFC(ERFC) algorithm,in which a hash-based aggregated bit vector scheme is exploited to speed up its preprocessing procedure.A compressed and cacheable data structure is also introduced to decrease total memory requirement and improve its searching performance.Evaluation results show that ERFC provides a great improvement over RFC in both space requirement and preprocessing time.The search time complexity of ERFC is equivalent to that of RFC in the worst case; and its average classifying speed is improved by about 100%.
文摘The network switches in the data plane of Software Defined Networking (SDN) are empowered by an elementary process, in which enormous number of packets which resemble big volumes of data are classified into specific flows by matching them against a set of dynamic rules. This basic process accelerates the processing of data, so that instead of processing singular packets repeatedly, corresponding actions are performed on corresponding flows of packets. In this paper, first, we address limitations on a typical packet classification algorithm like Tuple Space Search (TSS). Then, we present a set of different scenarios to parallelize it on different parallel processing platforms, including Graphics Processing Units (GPUs), clusters of Central Processing Units (CPUs), and hybrid clusters. Experimental results show that the hybrid cluster provides the best platform for parallelizing packet classification algorithms, which promises the average throughput rate of 4.2 Million packets per second (Mpps). That is, the hybrid cluster produced by the integration of Compute Unified Device Architecture (CUDA), Message Passing Interface (MPI), and OpenMP programming model could classify 0.24 million packets per second more than the GPU cluster scheme. Such a packet classifier satisfies the required processing speed in the programmable network systems that would be used to communicate big medical data.