Agrep
agrep is a proprietary fuzzy string searching program, developed by Udi Manber and Sun Wu between 1988 and 1991, for use with the Unix operating system. It was later ported to OS/2, DOS, and Windows.
For each query, it selects the best-suited algorithm from several of the fastest known built-in string searching algorithms, including Manber and Wu's bitap algorithm, which is based on Levenshtein distances.
agrep is also the search engine in the indexer program GLIMPSE. agrep is free for private and non-commercial use only, and belongs to the University of Arizona.
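To make the bitap idea mentioned above more concrete, the following is a minimal Python sketch of approximate matching in the style of Wu and Manber's bit-parallel recurrence. It is an illustration only, not agrep's actual source code: the function name `bitap_fuzzy_search`, its parameters, and its return convention are assumptions chosen for this example. It keeps one bit vector per allowed error count and updates all of them for each text character.

```python
def bitap_fuzzy_search(text, pattern, max_errors):
    """Return the end index (exclusive) of the first substring of `text`
    that matches `pattern` within `max_errors` edits, or -1 if none.

    Hypothetical helper for illustration; not part of agrep itself.
    """
    m = len(pattern)
    if m == 0:
        return 0
    full = (1 << m) - 1          # keeps the state vectors m bits wide
    accept = 1 << (m - 1)        # bit that signals "whole pattern matched"

    # masks[c] has bit i set when pattern[i] == c
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    # state[d] bit i set: pattern[:i+1] matches a suffix of the text read
    # so far with at most d errors. Initially d pattern characters can be
    # "matched" by d deletions.
    state = [((1 << d) - 1) & full for d in range(max_errors + 1)]

    for j, ch in enumerate(text):
        mask = masks.get(ch, 0)
        prev_old = state[0]                      # state[d-1] before this character
        state[0] = ((state[0] << 1) | 1) & mask  # exact shift-and step
        prev_new = state[0]                      # state[d-1] after this character
        for d in range(1, max_errors + 1):
            old = state[d]
            state[d] = ((((old << 1) | 1) & mask)  # characters match
                        | prev_old                 # extra character in the text
                        | (prev_old << 1)          # substituted character
                        | (prev_new << 1)          # pattern character missing from text
                        | 1) & full
            prev_old = old
            prev_new = state[d]
        if state[max_errors] & accept:
            return j + 1
    return -1
```

For example, under these assumptions `bitap_fuzzy_search("hello world", "wurld", 1)` finds a match ending at the end of the text, while the same call with `max_errors=0` reports no match. Because each state is a single integer, the work per text character is proportional to the number of allowed errors rather than to the pattern length, as long as the pattern fits in a machine word (Python integers do not impose that limit, but a C implementation would).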
Alternative implementations
A more recent agrep is the command-line tool provided with the TRE regular expression library. TRE agrep is more powerful than Wu-Manber agrep since it allows weights and total costs to be assigned separately to individual groups in the pattern. It can also handle Unicode. Unlike Wu-Manber agrep, TRE agrep is licensed under a 2-clause BSD-like license.
The FREJ (Fuzzy Regular Expressions for Java) open-source library provides a command-line interface that can be used in a way similar to agrep. Unlike agrep or TRE, it can be used to construct complex substitutions for matched text. However, its syntax and matching abilities differ significantly from those of ordinary regular expressions.
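The notion of assigning weights and a total cost to edits can be illustrated with a small weighted edit distance in Python. This is only a sketch of the general idea, assuming one cost per edit type applied to whole strings; it does not model TRE's per-group settings or its pattern syntax, and the names `weighted_edit_distance` and `matches_within_cost` are hypothetical.

```python
def weighted_edit_distance(source, target,
                           insert_cost=1, delete_cost=1, substitute_cost=1):
    """Cheapest total cost of edits turning `source` into `target`.

    Illustrative dynamic-programming sketch; not TRE's implementation.
    """
    # prev[j] = cost of turning the processed prefix of source into target[:j]
    prev = [j * insert_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [i * delete_cost]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + delete_cost,                            # drop s
                curr[j - 1] + insert_cost,                        # add t
                prev[j - 1] + (0 if s == t else substitute_cost)  # keep or replace
            ))
        prev = curr
    return prev[-1]


def matches_within_cost(source, target, max_cost, **costs):
    """True when the weighted distance does not exceed `max_cost`."""
    return weighted_edit_distance(source, target, **costs) <= max_cost
```

With unit costs this reduces to the ordinary Levenshtein distance, so "kitten" and "sitting" are 3 apart. Raising `substitute_cost` to 3 makes a deletion plus an insertion cheaper than a substitution, and the same pair then costs 5.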
External links
- Wu-Manber agrep
  - For Unix: ftp://ftp.cs.arizona.edu/agrep/
  - For DOS, Windows and OS/2 home page
- Entry for "agrep" in Christoph's Personal Wiki
- See also
  - TRE regexp matching package
  - cgrep, a command-line approximate string matching tool
  - nrgrep, a command-line approximate string matching tool
  - agrep as implemented in R