(Sunday 21st December 2003)

Just the thing to lift my Perl spirits once again: here comes an article (thanks, Keith) on how the combination of Class::DBI and Template Toolkit can make your coding work real fun and effective...

Below is a quote from that article that I find extremely useful, even if it is actually unrelated to the Class::DBI thing per se:
We start with a very simple canonicalization--stripping out vowels and collapsing repeated letters. (I've found that this can pick up about half of name misspellings found in the wild, which is pretty impressive.)
sub _canonicalise {
      my ($class, $word) = @_;
      return "" unless $word;
      $word = lc($word);
      $word =~ s/[aeiou]//g;    # remove vowels
      $word =~ s/(\w)\1+/$1/g;  # collapse doubled (or tripled, etc) letters
      return $word;
}

(The matching method can be improved. I've found that neither Text::Soundex nor Text::Metaphone are much of an improvement over the simple approach already detailed, but Text::DoubleMetaphone is definitely worth plugging in, to catch misspellings such as Nicolas/Nicholas and Asimov/Azimof.)
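To see what the quoted canonicalisation does in practice, here is a minimal sketch of my own (not from the article). It uses a standalone copy of the routine as a plain function, and the name list, the `%index` hash and the `suggest` helper are made up for illustration:

```perl
use strict;
use warnings;

# Standalone copy of the quoted routine, as a plain function
# (no $class argument, since there is no class here).
sub canonicalise {
    my ($word) = @_;
    return "" unless $word;
    $word = lc $word;
    $word =~ s/[aeiou]//g;      # remove vowels
    $word =~ s/(\w)\1+/$1/g;    # collapse repeated letters
    return $word;
}

# Hypothetical list of known names, indexed by canonical form.
my @known = ("Jennifer", "Matthew", "Nicolas");
my %index;
push @{ $index{ canonicalise($_) } }, $_ for @known;

# Hypothetical helper: return the known names that share the
# canonical form of the (possibly misspelled) input.
sub suggest {
    my ($input) = @_;
    return @{ $index{ canonicalise($input) } || [] };
}

for my $typo ("Jenifer", "Mathew", "Nicholas") {
    my @hits = suggest($typo);
    printf "%-10s => %s\n", $typo, @hits ? join(", ", @hits) : "(no match)";
}
```

"Jenifer" and "Mathew" find their correctly spelled versions, because both spellings reduce to the same key ("jnfr" and "mthw"). "Nicholas" does not find "Nicolas" ("nchls" vs. "ncls"), which is exactly the kind of case the article suggests handing to Text::DoubleMetaphone.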

I guess Perl can actually be FUN at times ;) (as if I didn't know anyway)

[ by Martin ]

Martin Spernau
© 1994-2003
