Package Info

perl-String-Approx


Perl extension for approximate matching (fuzzy matching)


Development/Libraries/Perl

String::Approx lets you match and substitute strings approximately. With this you can emulate errors: typing errorrs, speling errors, closely related vocabularies (colour color), genetic mutations (GAG ACT), abbreviations (McScot, MacScot).

NOTE: String::Approx suits the task of string matching, not string comparison, and it works for strings, not for text.

If you want to compare strings for similarity, you probably just want the Levenshtein edit distance (explained below), as implemented by the Text::Levenshtein and Text::LevenshteinXS modules on CPAN. See also Text::WagnerFischer and Text::PhraseDistance. (String::Approx has functions for this too, e.g. adist(), but their results sometimes differ from the plain Levenshtein distance.)

If you want to compare things like text or source code, which consist of words or tokens, phrases and sentences, or expressions and statements, you should probably use a tool other than String::Approx, such as the standard UNIX diff(1) tool or the Algorithm::Diff module from CPAN.

The measure of approximateness is the Levenshtein edit distance. It is the total number of "edits": insertions,

word world

deletions,

monkey money

and substitutions

sun fun

required to transform a string to another string. For example, to transform "lead" into "gold", you need three edits:

lead gead goad gold

The edit distance of "lead" and "gold" is therefore three, or 75% of the strings' four-character length.
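The edit count described above can be reproduced with the classic dynamic-programming recurrence. The following is a minimal sketch in Python, not String::Approx's own implementation; the function names levenshtein and relative_distance are hypothetical, and dividing by the longer string's length is an assumed normalization chosen so that "lead"/"gold" comes out as 75%.

```python
def levenshtein(source: str, target: str) -> int:
    """Count the insertions, deletions and substitutions that turn source into target."""
    # previous[j] is the distance between the source prefix processed so far
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i source characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


def relative_distance(source: str, target: str) -> float:
    """Edit distance as a fraction of the longer string (assumed normalization)."""
    longest = max(len(source), len(target))
    return levenshtein(source, target) / longest if longest else 0.0


if __name__ == "__main__":
    for a, b in [("word", "world"), ("monkey", "money"), ("sun", "fun"), ("lead", "gold")]:
        print(a, b, levenshtein(a, b), relative_distance(a, b))
```

Only two rows of the table are kept at a time, so memory use grows with the length of the target string rather than with the product of both lengths.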

String::Approx uses the Levenshtein edit distance as its measure, but it is not well suited for comparing strings of different length; in other words, if you want a "fuzzy eq", see above. String::Approx is more like regular expressions or index(): it finds substrings that are close matches.
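To make the difference between matching and comparing concrete, here is a minimal Python sketch of approximate substring search. It uses the textbook variant of the edit-distance recurrence in which a match may begin at any position of the text at no cost; this illustrates the idea only and is not a description of how String::Approx is implemented. The function name best_substring_distance is hypothetical.

```python
def best_substring_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between pattern and any substring of text."""
    # Before any text is read, matching the first i pattern characters
    # against an empty substring costs i edits.
    previous = list(range(len(pattern) + 1))
    best = previous[-1]
    for t_char in text:
        # Row entry 0 is always 0: a match may start at any text position for free.
        current = [0]
        for i, p_char in enumerate(pattern, start=1):
            cost = 0 if p_char == t_char else 1
            current.append(min(
                previous[i] + 1,         # t_char is an extra character in the text
                current[i - 1] + 1,      # p_char is missing from the text
                previous[i - 1] + cost,  # characters match or are substituted
            ))
        # A match may also end at any text position, so track the minimum.
        best = min(best, current[-1])
        previous = current
    return best


if __name__ == "__main__":
    # The pattern is found inside a longer string with a single edit.
    print(best_substring_distance("McScot", "Angus MacScot of Perth"))
```

A whole-string comparison would also count every character outside the matching region as an edit, which is why a substring matcher of this kind is a poor substitute for a "fuzzy eq".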


License: LGPL-2.0 OR Artistic-2.0
URL: http://search.cpan.org/dist/String-Approx/

Releases

Package Version   Update ID    Released     Package Hub Version   Platforms                         Subpackages
3.28-bp155.2.10   GA Release   2023-05-17   15 SP5                AArch64, ppc64le, s390x, x86-64   perl-String-Approx
3.28-bp154.1.21   GA Release   2022-05-09   15 SP4                AArch64, ppc64le, s390x, x86-64   perl-String-Approx
3.28-bp153.1.16   GA Release   2021-03-06   15 SP3                AArch64, ppc64le, s390x, x86-64   perl-String-Approx
3.28-bp152.3.13   GA Release   2020-04-16   15 SP2                AArch64, ppc64le, s390x, x86-64   perl-String-Approx
3.28-bp151.3.1    GA Release   2019-07-17   15 SP1                AArch64, s390x, x86-64            perl-String-Approx
3.28-bp151.2.14   GA Release   2019-05-18   15 SP1                ppc64le                           perl-String-Approx
3.28-bp150.2.5    GA Release   2018-07-30   15                    AArch64                           perl-String-Approx
3.28-bp150.2.4    GA Release   2018-07-30   15                    ppc64le, s390x, x86-64            perl-String-Approx