Tinderbox Umlaut encoding
2005-1-31
Hmmm... I'm sure I'm not alone with this: Where and how can I
tell Tinderbox to leave umlauts (äöü) unencoded in my (RSS) output?
The problem is that an encoded HTML entity (like &auml;) is not legal inside an XML element, since XML does not predefine it, so my feeds always get parse errors. The unescaped umlaut, on the other hand, would be perfectly legal in an encoding like UTF-8.
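To see the difference for yourself, here is a minimal Python sketch (Python and its standard xml.etree parser are my choice for illustration and have nothing to do with Tinderbox; the helper name parses_as_xml and the sample strings are made up):

```python
import xml.etree.ElementTree as ET


def parses_as_xml(document: str) -> bool:
    """Return True if the string is well-formed XML, False on a parse error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Hypothetical feed fragments, not actual Tinderbox output.
with_entity = "<description>M&auml;rchen</description>"       # HTML entity
with_raw_umlaut = "<description>Märchen</description>"         # plain character
with_numeric_ref = "<description>M&#228;rchen</description>"   # numeric reference

for sample in (with_entity, with_raw_umlaut, with_numeric_ref):
    print(parses_as_xml(sample), sample)
```

Only the first fragment should be rejected: the named entity is unknown to a plain XML parser, while the raw character and the numeric character reference are both fine.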
I've tried all settings in HTMLView and Document preferences but can't find a setting to change the umlaut behavior.
BTW: what encoding does Tinderbox export in? latin-1? utf-8?
[update] The answer is very simple, as most things are in Tinderbox: there's a boolean attribute called HTMLEntities. Set it to false and no conversion happens for that note on export.
The character encoding is another matter: Tinderbox currently exports in MacRoman, which covers much the same accented letters as the common latin-1 but stores them at different byte values. Using the HTMLEntities conversion everything is safe... for HTML. I'm told that Tinderbox will use latin-1 encoding in future versions; I guess that will be when the Windows version comes out...
Actually I would very much prefer utf-8... But I guess there are good reasons not to go down that road.
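Why the encoding matters for the raw umlauts can be shown with a small Python sketch (again just an illustration using Python's built-in codecs, not anything Tinderbox does; the helper name encode_hex is made up):

```python
def encode_hex(text: str, codec: str) -> str:
    """Encode text with the given codec and return the bytes as a hex string."""
    return text.encode(codec).hex()


# The same three characters come out as different bytes in each encoding.
for codec in ("mac_roman", "latin-1", "utf-8"):
    print(f"{codec:10} {encode_hex('äöü', codec)}")
```

A feed written in one of these encodings but declared (or guessed) as another will show broken characters, which is why the XML declaration and the actual bytes have to agree.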
[update and conclusion:] A rather obvious (in hindsight) solution is NOT to worry about HTML entities in the XML at all. Not by publishing invalid RSS/XML, but by using CDATA sections for the content of the description element (the actual post). That way any HTML markup, even markup that would be illegal as XML, can be transported inside the XML...
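The CDATA idea can be sketched in Python like this (an illustration only; in Tinderbox the wrapping would live in the export template, and the helper name cdata and the sample post are made up):

```python
import xml.etree.ElementTree as ET


def cdata(text: str) -> str:
    """Wrap text in a CDATA section.

    A literal ']]>' inside the text would end the section early, so it is
    split across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Hypothetical post body: HTML entities and an unclosed <br>, both illegal as XML.
post_html = "<p>M&auml;rchen &amp; mehr<br></p>"
item = "<item><description>" + cdata(post_html) + "</description></item>"

# The feed parses as XML, and the reader gets the HTML back untouched.
recovered = ET.fromstring(item).find("description").text
print(recovered == post_html)
```

The XML parser treats everything inside the CDATA section as plain character data, so the entity question never comes up at the XML level; the feed reader then interprets the content as HTML.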
All image, text, and audio material is © Martin Spernau; use and reproduction require the author's consent.