Efficient string matching
@article{Aho1975EfficientSM, title={Efficient string matching}, author={Alfred V. Aho and Margaret John Corasick}, journal={Communications of the ACM}, year={1975}, volume={18}, pages={333 - 340} }
This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern matching machine takes time proportional to the sum of the lengths of the keywords. The number of state transitions made by the pattern matching…
Figures from this paper
2,503 Citations
A Method for Improving String Pattern Matching Machines
- Economics, Computer ScienceIEEE Transactions on Software Engineering
- 1984
This correspondence describes an efficient string pattern matching machine to locate all occurrences of any of a finite number of keywords and phrases in an arbitrary text string. Some conditions are…
An Efficient Implementation of Static String Pattern Matching Machines
- Computer ScienceIEEE Trans. Software Eng.
- 1989
A technique for implementing a static transition table of a string pattern matching machine which locates all occurrences of a finite number of keywords in a string is described and can be speeded up by a finite straight program without loops.
An efficient implemenatation of string pattern matching machines for a finite number of keywords
- Computer Science, EconomicsSIGF
- 1989
A method of implementing a static transition table of a string pattern matching machine to locate all occurrences of a finite number of keywords in a text string by combining the fast access of an array representation with the compactness of a list structure.
A Fast String-Searching Algorithm for Multiple Patterns
- Computer ScienceInf. Process. Manag.
- 1993
Approximate string matching: a simpler faster algorithm
- Computer ScienceSODA '98
- 1998
We give two algorithms for finding all approximate matches of a pattern in a text, where the edit distance between the pattern and the matching text substring is at most k. The first algorithm, which…
An Efficient Multi-Attribute Pattern Matching Machine
- Computer ScienceStringology
- 1996
The proposed algorithm enables us to match set representations containing multiple attributes to locate all occurrences of any of a nite number of the sequence of rule structures (called matching rules) in a sequence of input structures.
Improved Approximate Multiple Pattern String Matching using Consecutive Q Grams of Pattern
- Computer Science
- 2013
To improve the efficiency of basic Shift OR algorithm by reducing the number of false matches that is detected along with the correct matches by the algorithm, proposed Shift OR with consecutive q grams has been implemented.
Approximate Multiple Pattern String Matching using Bit Parallelism: A Review
- Computer Science
- 2013
Bit parallelism enhances the processing speed of the approximate string matching algorithm as it takes the benefit of the internal bit operations taking place in parallel inside the system.
References
SHOWING 1-10 OF 20 REFERENCES
Fast Pattern Matching in Strings
- Computer ScienceSIAM J. Comput.
- 1977
An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings, showing that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.
STRING-MATCHING AND OTHER PRODUCTS
- Computer Science
- 1974
By exploiting the formal similarity of string-matching with integer multiplication, a new algorithm has been obtained with a running time which is only slightly worse than linear.
Programming Techniques: Regular expression search algorithm
- Computer ScienceCACM
- 1968
A method for locating specific character strings embedded in character text is described and an implementation of this method in the form of a compiler is discussed. The compiler accepts a regular…
Automatic generation of efficient lexical processors using finite state techniques
- Computer ScienceCACM
- 1968
The practical application of the theory of finite-state automata to automatically generate lexical processors is dealt with in this tutorial article by the use of the AED RWORD system, developed at…
The Design and Analysis of Computer Algorithms
- Computer Science
- 1974
This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.
Implementation of the substring test by hashing
- Computer ScienceCACM
- 1971
Tradeoff curves are developed to show minimal cost of file usage by grouping various partially combined indices under conditions offile usage with different fractions of retrieval and update.
Microtext: the design of a microprogrammed finite state search machine for full-text retrieval
- Computer ScienceAFIPS '72 (Fall, part I)
- 1972
The Microtext system represents a new approach to the design and implementation of a full-text retrieval system that integrates hardware, firmware, and software components in an attempt to provide a solution to the problems involved in processing large files of unformatted textual data.
Derivatives of Regular Expressions
- MathematicsJACM
- 1964
In this paper the notion of a derivative of a regular expression is introduced atld the properties of derivatives are discussed and this leads, in a very natural way, to the construction of a state diagram from a regularexpression containing any number of logical operators.
A system for typesetting mathematics
- Computer ScienceCACM
- 1975
The design and implementation of a system for typesetting mathematics, designed to be easy to learn and to use by people who know neither mathematics nor typesetting, is described.
Regular Expressions and State Graphs for Automata
- Computer Science, MathematicsIRE Trans. Electron. Comput.
- 1960
Algorithms are presented for 1) converting a state graph describing the behavior of an automaton to a regular expression describing the behavior of the same automaton (section 2), and 2) for…