
what feedster is good for
(Monday 7th April 2003)

reading an RSS search for your own blog... I'd never have caught this one, as it doesn't point to my blog but to an article I wrote:
Having battled through that lot, the conscientious aggregator writer hits the next big hurdle: approximately 10% of RSS feeds are badly formed XML! This issue is covered by Mark Pilgrim in "Parsing RSS at all costs", where he presents an ultra-liberal Python RSS parser which uses Python's relatively forgiving sgmllib module. Great, except PHP doesn't have one of those... enter REX, a technique for "shallow parsing" of XML using regular expressions (no, it's not as kludgy as it sounds; in fact Python's sgmllib module is built on the same principles). Martin Spernau has an excellent article showing how REX can be implemented in PHP and demonstrates the technique in a modified version of the MagpieRSS library. Of course, XML purists (with very good reason) advocate ignoring badly formed feeds, but as Mark points out, this really isn't a very practical approach.
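To give a rough idea of what "shallow parsing" with regular expressions means, here is a minimal sketch in Python. It is not the real REX expression (which is far more thorough) and it is not the code from the article or from MagpieRSS; the pattern names, the helper functions and the sample feed are all made up for illustration. The point is only that the regexes pick out tags and text without ever checking that the document is well-formed.

```python
import re

# Simplified stand-ins for the REX patterns (assumption: the real REX
# grammar also handles comments, CDATA, processing instructions, etc.).
TOKEN_RE = re.compile(r"(<[^>]*>)")
ITEM_RE = re.compile(r"<item\b[^>]*>(.*?)</item\s*>", re.IGNORECASE | re.DOTALL)
TITLE_RE = re.compile(r"<title\b[^>]*>(.*?)</title\s*>", re.IGNORECASE | re.DOTALL)


def shallow_tokens(xml):
    """Split a string into markup and text tokens; nesting is never checked."""
    return [tok for tok in TOKEN_RE.split(xml) if tok]


def item_titles(xml):
    """Collect the title of every <item>, however broken the rest of the feed is."""
    titles = []
    for item in ITEM_RE.findall(xml):
        match = TITLE_RE.search(item)
        if match:
            titles.append(match.group(1).strip())
    return titles


# Hypothetical feed with two typical defects: an unescaped "&" and a
# missing </channel> tag. A strict XML parser would reject it.
BROKEN_FEED = """<rss version="0.91"><channel>
<item><title>Cats & dogs</title><link>http://example.com/1</link></item>
<item><title>Second post</title><link>http://example.com/2</link></item>
</rss>"""

if __name__ == "__main__":
    for title in item_titles(BROKEN_FEED):
        print(title)
```

The trade-off is the one the quote describes: the shallow approach still gets the titles out of a feed a strict parser would refuse, at the price of not noticing that the feed is broken at all.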

[by Martin]




Martin Spernau
© 1994-2003
