Class::DBI and Template Toolkit can make your coding work real fun and effective...
Class::DBI
thing per se:
We start with a very simple canonicalization--stripping out vowels and collapsing repeated letters. (I've found that this can pick up about half of name misspellings found in the wild, which is pretty impressive.)

    sub _canonicalise {
        my ($class, $word) = @_;
        return "" unless $word;
        $word = lc($word);
        $word =~ s/[aeiou]//g;      # remove vowels
        $word =~ s/(\w)\1+/$1/eg;   # collapse doubled
                                    # (or tripled, etc) letters
        return $word;
    }

(The matching method can be improved. I've found that neither Text::Soundex nor Text::Metaphone are much of an improvement over the simple approach already detailed, but Text::DoubleMetaphone is definitely worth plugging in, to catch misspellings such as Nicolas/Nicholas and Asimov/Azimof.)
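To see what this canonicalization catches and what it misses, here is a minimal sketch. It assumes a plain function instead of the class method quoted above; the names canonicalise and same_name and the sample name pairs are made up for illustration and are not part of the quoted code.

```perl
use strict;
use warnings;

# Standalone variant of the quoted canonicalisation (hypothetical name;
# a plain function, not a Class::DBI class method).
sub canonicalise {
    my ($word) = @_;
    return "" unless $word;
    $word = lc $word;
    $word =~ s/[aeiou]//g;     # remove vowels
    $word =~ s/(\w)\1+/$1/g;   # collapse runs of the same letter
    return $word;
}

# Hypothetical helper: two spellings count as the same name
# when their canonical forms are identical.
sub same_name {
    my ($first, $second) = @_;
    return canonicalise($first) eq canonicalise($second);
}

for my $pair (["Tolkien", "Tolkein"], ["Asimov", "Assimov"], ["Nicolas", "Nicholas"]) {
    my ($x, $y) = @$pair;
    printf "%-8s vs %-8s: %s\n", $x, $y, same_name($x, $y) ? "match" : "no match";
}
```

The first two pairs come out as matches (both reduce to "tlkn" and "smv" respectively), while Nicolas/Nicholas does not ("ncls" vs "nchls"). That is the kind of case the quoted text suggests handing to Text::DoubleMetaphone.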
I guess Perl can actually be FUN at times ;) (as if I didn't know anyway)
[ by Martin ]