
(Saturday 22nd November 2003)

There's been some discussion in the blog world about using a Bayesian categorizer to enable a person to discriminate along various interest/non-interest axes. I took a run at this recently and, although my experiments haven't been wildly successful, I want to report them because I think the idea may have merit.

I see one major problem with his setup: each post can and should be able to belong to multiple categories. Still, it's a very interesting read.

Some further thoughts on his blog... (Jon Udell: Working with Bayesian categorizers)

What I think is the single most crippling thing about 'training classifiers' is exactly the training part. What works for spam filtering (where there are only two categories, good|bad) will fail when the user needs to define her own multiple categories, and to do so well. My approach (which isn't working very well either) tries to circumvent that problem altogether. I'm looking for a way to have posts 'cluster' together, without categorizing them in any specific way beforehand. My system of Perl scripts (hopefully) simply finds 'similars' to the current piece of text, and lets the user decide (ad hoc) which relations are relevant (in the given moment).
More on that from the past...
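
To make the 'find similars' idea a bit more concrete, here is a minimal sketch. It is not the Perl system mentioned above (this post doesn't describe how that one works); it is written in Python and assumes a plain bag-of-words model with cosine similarity, which is just one common way to compare texts without any training. The function names and the sample posts are made up for illustration.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count its words (a crude bag-of-words model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 = no shared words)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similars(current, posts, top_n=3):
    """Rank posts by similarity to `current`; no categories and no training step."""
    current_vec = word_counts(current)
    scored = [(cosine_similarity(current_vec, word_counts(text)), title)
              for title, text in posts.items()]
    # Drop posts that share no words at all, then put the best matches first.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(reverse=True)
    return scored[:top_n]


# Hypothetical posts, invented for this example.
posts = {
    "spam": "training a bayesian spam filter on good and bad mail",
    "perl": "perl scripts that cluster blog posts by similar words",
    "garden": "planting tomatoes in the garden this spring",
}
similars = find_similars("bayesian filter training for blog posts", posts)
for score, title in similars:
    print(f"{title}: {score:.2f}")
```

The point of doing it this way is that nobody has to define categories up front: the list of similars is computed fresh for whatever text is current, and the reader decides which of the suggested relations matter. A real setup would probably want to down-weight very common words, which this sketch does not do.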

[ by Martin ]


Martin Spernau
© 1994-2003
