Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Chan...Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the searching pro- cess of the algorithm faster. PMCCC enhances the shift process that determines how the pattern moves in case of the occurrence of the mismatch between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC made enhancements in searching process by reducing the number of comparisons and the number of attempts. Comparing the results of PMCCC with other related algorithms has shown significant enhancements in average number of comparisons and average number of attempts.展开更多
Deep Packet Inspection(DPI)at the core of many monitoring appliances,such as NIDS,NIPS,plays a major role.DPI is beneficial to content providers and censorship to monitor network traffic.However,the surge of network t...Deep Packet Inspection(DPI)at the core of many monitoring appliances,such as NIDS,NIPS,plays a major role.DPI is beneficial to content providers and censorship to monitor network traffic.However,the surge of network traffic has put tremendous pressure on the performance of DPI.In fact,the sensitive content being monitored is only a minority of network traffic,that is to say,most is undesired.A close look at the network traffic,we found that it contains many undesired high frequency content(UHC)that are not monitored.As everyone knows,the key to improve DPI performance is to skip as many useless characters as possible.Nevertheless,researchers generally study the algorithm of skipping useless characters through sensitive content,ignoring the high-frequency non-sensitive content.To fill this gap,in this literature,we design a model,named Fast AC Model with Skipping(FAMS),to quickly skip UHC while scanning traffic.The model consists of a standard AC automaton,where the input traffic is scanned byte-by-byte,and an additional sub-model,which includes a mapping set and UHC matching model.The mapping set is a bridge between the state node of AC and UHC matching model,while the latter is to select a matching function from hash and fingerprint functions.Our experiments show promising results that we achieve a throughput gain of 1.3-2.6 times the original throughput and 1.1-1.3 times Barr’s double path method.展开更多
文摘Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the searching pro- cess of the algorithm faster. PMCCC enhances the shift process that determines how the pattern moves in case of the occurrence of the mismatch between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC made enhancements in searching process by reducing the number of comparisons and the number of attempts. Comparing the results of PMCCC with other related algorithms has shown significant enhancements in average number of comparisons and average number of attempts.
基金This work was supported by National Natural Science Foundation of China under Grant(Nos.61771166,61771166,61402137)National Key Research&Development Plan of China under Grant 2016QY05X1000。
文摘Deep Packet Inspection(DPI)at the core of many monitoring appliances,such as NIDS,NIPS,plays a major role.DPI is beneficial to content providers and censorship to monitor network traffic.However,the surge of network traffic has put tremendous pressure on the performance of DPI.In fact,the sensitive content being monitored is only a minority of network traffic,that is to say,most is undesired.A close look at the network traffic,we found that it contains many undesired high frequency content(UHC)that are not monitored.As everyone knows,the key to improve DPI performance is to skip as many useless characters as possible.Nevertheless,researchers generally study the algorithm of skipping useless characters through sensitive content,ignoring the high-frequency non-sensitive content.To fill this gap,in this literature,we design a model,named Fast AC Model with Skipping(FAMS),to quickly skip UHC while scanning traffic.The model consists of a standard AC automaton,where the input traffic is scanned byte-by-byte,and an additional sub-model,which includes a mapping set and UHC matching model.The mapping set is a bridge between the state node of AC and UHC matching model,while the latter is to select a matching function from hash and fingerprint functions.Our experiments show promising results that we achieve a throughput gain of 1.3-2.6 times the original throughput and 1.1-1.3 times Barr’s double path method.