Efficient string matching

@article{Aho1975EfficientSM,
  title={Efficient string matching},
  author={Alfred V. Aho and Margaret John Corasick},
  journal={Communications of the ACM},
  year={1975},
  volume={18},
  pages={333 - 340}
}
  • A. AhoM. J. Corasick
  • Published 1 June 1975
  • Economics, Computer Science, Education
  • Communications of the ACM
This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern matching machine takes time proportional to the sum of the lengths of the keywords. The number of state transitions made by the pattern matching… 

Figures from this paper

A Method for Improving String Pattern Matching Machines

This correspondence describes an efficient string pattern matching machine to locate all occurrences of any of a finite number of keywords and phrases in an arbitrary text string. Some conditions are

An Efficient Implementation of Static String Pattern Matching Machines

  • J. Aoe
  • Computer Science
    IEEE Trans. Software Eng.
  • 1989
A technique for implementing a static transition table of a string pattern matching machine which locates all occurrences of a finite number of keywords in a string is described and can be speeded up by a finite straight program without loops.

An efficient implemenatation of string pattern matching machines for a finite number of keywords

  • J. Aoe
  • Computer Science, Economics
    SIGF
  • 1989
A method of implementing a static transition table of a string pattern matching machine to locate all occurrences of a finite number of keywords in a text string by combining the fast access of an array representation with the compactness of a list structure.

A string pattern - matching algorithm

A Fast String-Searching Algorithm for Multiple Patterns

Approximate string matching: a simpler faster algorithm

We give two algorithms for finding all approximate matches of a pattern in a text, where the edit distance between the pattern and the matching text substring is at most k. The first algorithm, which

An Efficient Multi-Attribute Pattern Matching Machine

The proposed algorithm enables us to match set representations containing multiple attributes to locate all occurrences of any of a nite number of the sequence of rule structures (called matching rules) in a sequence of input structures.

Improved Approximate Multiple Pattern String Matching using Consecutive Q Grams of Pattern

To improve the efficiency of basic Shift OR algorithm by reducing the number of false matches that is detected along with the correct matches by the algorithm, proposed Shift OR with consecutive q grams has been implemented.

Approximate Multiple Pattern String Matching using Bit Parallelism: A Review

Bit parallelism enhances the processing speed of the approximate string matching algorithm as it takes the benefit of the internal bit operations taking place in parallel inside the system.
...

References

SHOWING 1-10 OF 20 REFERENCES

Fast Pattern Matching in Strings

An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings, showing that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.

STRING-MATCHING AND OTHER PRODUCTS

By exploiting the formal similarity of string-matching with integer multiplication, a new algorithm has been obtained with a running time which is only slightly worse than linear.

Programming Techniques: Regular expression search algorithm

A method for locating specific character strings embedded in character text is described and an implementation of this method in the form of a compiler is discussed. The compiler accepts a regular

Automatic generation of efficient lexical processors using finite state techniques

The practical application of the theory of finite-state automata to automatically generate lexical processors is dealt with in this tutorial article by the use of the AED RWORD system, developed at

The Design and Analysis of Computer Algorithms

This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.

Implementation of the substring test by hashing

Tradeoff curves are developed to show minimal cost of file usage by grouping various partially combined indices under conditions offile usage with different fractions of retrieval and update.

Microtext: the design of a microprogrammed finite state search machine for full-text retrieval

The Microtext system represents a new approach to the design and implementation of a full-text retrieval system that integrates hardware, firmware, and software components in an attempt to provide a solution to the problems involved in processing large files of unformatted textual data.

Derivatives of Regular Expressions

In this paper the notion of a derivative of a regular expression is introduced atld the properties of derivatives are discussed and this leads, in a very natural way, to the construction of a state diagram from a regularexpression containing any number of logical operators.

A system for typesetting mathematics

The design and implementation of a system for typesetting mathematics, designed to be easy to learn and to use by people who know neither mathematics nor typesetting, is described.

Regular Expressions and State Graphs for Automata

Algorithms are presented for 1) converting a state graph describing the behavior of an automaton to a regular expression describing the behavior of the same automaton (section 2), and 2) for