(Tuesday 22nd July 2003)

Are here! Yes, I finally dug into the 'Contextual Network Graph' Search::ContextGraph stuff and came up with a replacement for the VectorSpace based 'similar entries' engine that has been running here for some time now. It took a lot of parameter tweaking till it gave similar results to the VS setup I had...
The real fun part about it is this: for much the same result, it's blazingly FAST. Where my VS thingie would take 40+ min. of CPU time to complete a scan, the CG thing takes about 23 sec. !! And that's w/o using the cool 'store' functionality that lets you store a ContextGraph for later use (a thing that isn't technically possible with the VS approach)... And this Search::ContextGraph is a pure Perl module with very few dependencies... VectorSpace needs a C-based extension (PDL).
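
For the curious, here is a rough sketch of the spreading-activation idea behind a contextual network graph, in Python rather than Perl. It is NOT Search::ContextGraph's actual code or API: the function names (build_graph, similar), the energy and threshold parameters, and the way energy gets split up are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(docs):
    """Link every document to its terms and every term to its documents."""
    doc_terms = {name: set(text.lower().split()) for name, text in docs.items()}
    term_docs = defaultdict(set)
    for name, terms in doc_terms.items():
        for term in terms:
            term_docs[term].add(name)
    return doc_terms, term_docs


def similar(start, doc_terms, term_docs, energy=1.0, threshold=0.01):
    """Spread energy from one document to its terms, then on to other documents.

    Hypothetical scoring: energy is split evenly at each hop, and any share
    smaller than `threshold` is dropped.
    """
    scores = defaultdict(float)
    terms = doc_terms[start]
    if not terms:
        return []
    per_term = energy / len(terms)
    for term in terms:
        neighbours = term_docs[term] - {start}
        if not neighbours:
            continue  # nobody else uses this term, the energy goes nowhere
        share = per_term / len(neighbours)
        if share < threshold:
            continue  # too little energy left to matter
        for doc in neighbours:
            scores[doc] += share
    # Highest score first, ties broken by name
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


posts = {
    "a": "vector space search engine",
    "b": "context graph search engine",
    "c": "holiday photos",
}
doc_terms, term_docs = build_graph(posts)
print(similar("a", doc_terms, term_docs))
```

The point of the toy: once the graph is built, finding neighbours is just a walk over a few edges, with no vector maths involved.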

So much for now, more details when I get a chance to breathe ;)

[update:] something is weird with it on this server... shouldn't it find VectorSpace? Posts with that word in them should be similar to this one, no? Back to the drawing board, I guess...

[update:2003-07-23] ... it appears the results for VectorSpace and ContextGraph are rather similar as long as the posts are rather short... but as the posts get longer, CG fails... digging..

