Some thoughts on similarity
(Thursday 8th May 2003)

As my readers have gathered, I'm rather fond of the 'similar entries' feature I have running here. These days I often combine it with a quick keyword search using my brute-force regex search.

One thought that always shows up with any kind of 'autogenerated categorisation' is this: 'Why are those two entries related, or why are they considered related?'
The problem is that there are usually several kinds of relation entries can share: the same keywords, links to the same URL, the same author, etc.
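
To make the 'why are they related?' question concrete, here is a minimal sketch of how one could report which kinds of relation two entries share. It is not the code running on this site; the `Entry` record, its field names, and the `explain_relation` function are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    # Hypothetical entry record; the field names are assumptions.
    title: str
    author: str
    keywords: set = field(default_factory=set)
    links: set = field(default_factory=set)


def explain_relation(a, b):
    """Return the kinds of relation two entries share, with the evidence."""
    reasons = {}
    shared_keywords = a.keywords & b.keywords
    if shared_keywords:
        reasons["keywords"] = sorted(shared_keywords)
    shared_links = a.links & b.links
    if shared_links:
        reasons["links"] = sorted(shared_links)
    if a.author == b.author:
        reasons["author"] = a.author
    return reasons
```

An empty result would mean that none of these three kinds of relation applies, which says nothing about other ways the entries might still be connected.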

In my experience, though, that usually matters very little. Once one learns to approach the results with some curiosity and to expect a surprise here and there, most auto-generated relations make perfect sense.

Given the realm I use this kind of FOA in, the 'surprises' are actually welcome: they boost associative recall, turn up buried knowledge, and make new, unexpected connections between materials.

