Given a set U which is consisted of strings defined on alphabet Σ, string cross pattern matching is to find all the matches between every two strings in U. It is utilized in text processing like removing the duplicat...Given a set U which is consisted of strings defined on alphabet Σ, string cross pattern matching is to find all the matches between every two strings in U. It is utilized in text processing like removing the duplication of strings. This paper presents a fast string cross pattern matching algorithm based on extracting high frequency strings. Compared with existing algorithms including single-pattern algorithms and multi-pattern matching algorithms, this algorithm is featured by both low time complexity and low space complexity. Because Chinese alphabet is large and the average length of Chinese words is much short, this algorithm is more suitable to process the text written by Chinese, especially when the size of Σ is large and the number of strings is far more than the maximum length of strings of set U.展开更多
String matching algorithms are an important piece in the network intrusion detection systems. In these systems, the chain coincidence algorithms occupy more than half the CPU process time. The GPU technology has showe...String matching algorithms are an important piece in the network intrusion detection systems. In these systems, the chain coincidence algorithms occupy more than half the CPU process time. The GPU technology has showed in the past years to have a superior performance on these types of applications than the CPU. In this article we perform a review of the state of the art of the different string matching algorithms used in network intrusion detection systems;and also some research done about CPU and GPU on this area.展开更多
Modern applications require large databases to be searched for regions that are similar to a given pattern. The DNA sequence analysis, speech and text recognition, artificial intelligence, Internet of Things, and many...Modern applications require large databases to be searched for regions that are similar to a given pattern. The DNA sequence analysis, speech and text recognition, artificial intelligence, Internet of Things, and many other applications highly depend on pattern matching or similarity searches. In this paper, we discuss some of the string matching solutions developed in the past. Then, we present a novel mathematical model to search for a given pattern and it’s near approximates in the text.展开更多
String matching is seen as one of the essential problems in computer science. A variety of computer applications provide the string matching service for their end users. The remarkable boost in the number of data that...String matching is seen as one of the essential problems in computer science. A variety of computer applications provide the string matching service for their end users. The remarkable boost in the number of data that is created and kept by modern computational devices influences researchers to obtain even more powerful methods for coping with this problem. In this research, the Quick Search string matching algorithm are adopted to be implemented under the multi-core environment using OpenMP directive which can be employed to reduce the overall execution time of the program. English text, Proteins and DNA data types are utilized to examine the effect of parallelization and implementation of Quick Search string matching algorithm on multi-core based environment. Experimental outcomes reveal that the overall performance of the mentioned string matching algorithm has been improved, and the improvement in the execution time which has been obtained is considerable enough to recommend the multi-core environment as the suitable platform for parallelizing the Quick Search string matching algorithm.展开更多
Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Chan...Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the searching pro- cess of the algorithm faster. PMCCC enhances the shift process that determines how the pattern moves in case of the occurrence of the mismatch between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC made enhancements in searching process by reducing the number of comparisons and the number of attempts. Comparing the results of PMCCC with other related algorithms has shown significant enhancements in average number of comparisons and average number of attempts.展开更多
文摘Given a set U which is consisted of strings defined on alphabet Σ, string cross pattern matching is to find all the matches between every two strings in U. It is utilized in text processing like removing the duplication of strings. This paper presents a fast string cross pattern matching algorithm based on extracting high frequency strings. Compared with existing algorithms including single-pattern algorithms and multi-pattern matching algorithms, this algorithm is featured by both low time complexity and low space complexity. Because Chinese alphabet is large and the average length of Chinese words is much short, this algorithm is more suitable to process the text written by Chinese, especially when the size of Σ is large and the number of strings is far more than the maximum length of strings of set U.
文摘String matching algorithms are an important piece in the network intrusion detection systems. In these systems, the chain coincidence algorithms occupy more than half the CPU process time. The GPU technology has showed in the past years to have a superior performance on these types of applications than the CPU. In this article we perform a review of the state of the art of the different string matching algorithms used in network intrusion detection systems;and also some research done about CPU and GPU on this area.
文摘Modern applications require large databases to be searched for regions that are similar to a given pattern. The DNA sequence analysis, speech and text recognition, artificial intelligence, Internet of Things, and many other applications highly depend on pattern matching or similarity searches. In this paper, we discuss some of the string matching solutions developed in the past. Then, we present a novel mathematical model to search for a given pattern and it’s near approximates in the text.
文摘String matching is seen as one of the essential problems in computer science. A variety of computer applications provide the string matching service for their end users. The remarkable boost in the number of data that is created and kept by modern computational devices influences researchers to obtain even more powerful methods for coping with this problem. In this research, the Quick Search string matching algorithm are adopted to be implemented under the multi-core environment using OpenMP directive which can be employed to reduce the overall execution time of the program. English text, Proteins and DNA data types are utilized to examine the effect of parallelization and implementation of Quick Search string matching algorithm on multi-core based environment. Experimental outcomes reveal that the overall performance of the mentioned string matching algorithm has been improved, and the improvement in the execution time which has been obtained is considerable enough to recommend the multi-core environment as the suitable platform for parallelizing the Quick Search string matching algorithm.
文摘Pattern matching is a very important algorithm used in many applications such as search engine and DNA analysis. They are aiming to find a pattern in a text. This paper proposes a Pattern Matching Algorithm Using Changing Consecutive Characters (PMCCC) to make the searching pro- cess of the algorithm faster. PMCCC enhances the shift process that determines how the pattern moves in case of the occurrence of the mismatch between the pattern and the text. It enhances the Berry Ravindran (BR) shift function by using m consecutive characters where m is the pattern length. The formal basis and the algorithms are presented. The experimental results show that PMCCC made enhancements in searching process by reducing the number of comparisons and the number of attempts. Comparing the results of PMCCC with other related algorithms has shown significant enhancements in average number of comparisons and average number of attempts.