I can't really find answers to these questions:
- Which messages are considered when comparing for Junk? All the ones 'tagged as Junk'?
- What happens when I delete messages that are tagged as Junk? Does Moz 'forget' those examples?
- Is it better to keep just a certain amount of example Junk around? Or should I save all Spam/Junk?
Background on this: I now have an example corpus of 1300+ Spam/Junk messages, and I have noticed a drop in detection accuracy. At first I trained Moz on a large corpus of Junk mail and found it was overzealous. Now that I have marked a lot of messages as 'not junk', I see the exact opposite: it doesn't detect some obvious Junk at all...
So is it better to just have a rather small (200+) corpus of example Junk, or what?
And then, is there a difference between messages not marked at all and messages marked 'not junk'?
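To make these questions concrete, here is a minimal sketch of a generic naive Bayesian token filter. It is not Mozilla's actual code: the class `TinyBayes` and all of its methods are hypothetical names made up for illustration, and the 0.01/0.99 clamping and the 0.5 value for unseen tokens are assumptions borrowed from commonly described Bayesian spam filters.

```python
from collections import Counter


class TinyBayes:
    """Toy Bayesian junk filter (illustrative only, not Mozilla's implementation)."""

    def __init__(self):
        self.junk_tokens = Counter()  # token -> number of junk messages containing it
        self.good_tokens = Counter()  # token -> number of 'not junk' messages containing it
        self.n_junk = 0               # messages explicitly marked as junk
        self.n_good = 0               # messages explicitly marked as not junk

    def train(self, text, is_junk):
        # Only explicitly marked messages reach this method;
        # an unmarked message changes no counts at all.
        tokens = set(text.lower().split())
        if is_junk:
            self.junk_tokens.update(tokens)
            self.n_junk += 1
        else:
            self.good_tokens.update(tokens)
            self.n_good += 1

    def token_prob(self, token):
        # Compare how often the token shows up in each corpus,
        # normalised by corpus size so a big junk pile does not win by volume.
        j = self.junk_tokens[token] / self.n_junk if self.n_junk else 0.0
        g = self.good_tokens[token] / self.n_good if self.n_good else 0.0
        if j + g == 0:
            return 0.5  # assumption: unseen tokens are treated as neutral
        return min(0.99, max(0.01, j / (j + g)))

    def score(self, text):
        # Combine the per-token probabilities into one junk score between 0 and 1.
        p = q = 1.0
        for token in set(text.lower().split()):
            x = self.token_prob(token)
            p *= x
            q *= 1.0 - x
        return p / (p + q)


f = TinyBayes()
f.train("buy cheap pills now", is_junk=True)
f.train("meeting notes for monday", is_junk=False)
print(f.score("cheap pills"))     # close to 1: looks like junk
print(f.score("monday meeting"))  # close to 0: looks like normal mail
print(f.score("hello"))           # 0.5: nothing known about this token
```

In a filter built like this sketch, a message marked 'not junk' feeds the good-token counts while an unmarked message contributes nothing, and the counts live apart from the messages, so deleting a trained message would not make the filter forget it. Whether Moz behaves the same way is exactly what I can't find out.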
[ by Martin ]
Martin Spernau
© 1994-2003
Big things to come (TM) 30th Dez 2002