CDATA in RSS descriptions

I've verified that Radio's aggregator omits the descriptions in Brad Wilson's valid RSS feed. Update: Phil reports that several other aggregators have similar problems.

My suggestion for supporting the widest range of aggregators is to add a remove_html filter for the description, and to provide the full content in content:encoded elements.
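
As a rough illustration of that suggestion, here is a minimal Python sketch that puts a tag-free summary in description and the full markup in content:encoded. The remove_html function below is a hypothetical stand-in named after the filter, not the filter's actual implementation, and build_item and the sample text are made up for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Namespace of the RSS content module, which defines content:encoded.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
ET.register_namespace("content", CONTENT_NS)


class _TextOnly(HTMLParser):
    """Collects character data and drops every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def remove_html(markup):
    # Hypothetical stand-in for a remove_html filter: keep the text only,
    # then collapse runs of whitespace.
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    return " ".join("".join(parser.parts).split())


def build_item(title, body_html):
    # Plain text goes in <description>; the full markup goes in
    # <content:encoded>. ElementTree escapes both when serializing.
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "description").text = remove_html(body_html)
    ET.SubElement(item, "{%s}encoded" % CONTENT_NS).text = body_html
    return item


body = "<p>Hello <em>world</em> &amp; friends</p>"
item = build_item("Example", body)
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

An aggregator that only understands plain descriptions still gets readable text, and one that supports the content module gets the full entry.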

Phil Ringnalda has additional suggestions.


What's wrong with everyone else's copy of Radio, then? Mine displays his description with no problem whatsoever.

Posted by Phil Ringnalda at

Ah, two different feeds: index.rdf, which autodiscovery grabs, has a single CDATA block and works fine in (my copy of) Radio, but his index.xml is striped CDATA/PCDATA from adding an entity encoded permalink and comments link to the description, and Radio doesn't like that.

Posted by Phil Ringnalda at

And it's not just Radio that doesn't like that: NewzCrawler, Aggie, and at least one other that I forgot in a flurry of testing also disapprove of a mix of CDATA and PCDATA.

Posted by Phil Ringnalda at


So here we are

So here we are, snowed in with no birth control. (191 words)...

Excerpt from dive into mark at


Does anyone still have a copy of the file that caused the original problem? Or some example?

This seems like it would be useful to keep around, and make available for RSS tool builders.

Posted by kellan at

Simple enough to make your own example: take any random RSS file, and in the item description, put a few words in a CDATA section, then a few more in entity encoded PCDATA, then another CDATA section, then a bit more PCDATA. Let's see if I can break Sam's comments with this:

<description><![CDATA[First <del>section</del>]]> Second &lt;em&gt;section&lt;/em&gt;<![CDATA[Third <del>section</del>]]></description>

Say, that worked pretty well, other than having to find a tag that wasn't accepted and interpreted for the CDATA parts. Looking forward to seeing what this does when it goes out in the comment RSS feed, though.
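
For tool builders who want to check their own parser against this case, a minimal sketch using Python's standard ElementTree, chosen here only for illustration: a conforming XML parser should hand back the striped CDATA/PCDATA description as one continuous string.

```python
import xml.etree.ElementTree as ET

# The striped description from the example above: CDATA, then entity
# encoded PCDATA, then CDATA again.
feed_item = (
    "<description><![CDATA[First <del>section</del>]]>"
    " Second &lt;em&gt;section&lt;/em&gt;"
    "<![CDATA[Third <del>section</del>]]></description>"
)

# The parser merges all three runs into a single text value.
text = ET.fromstring(feed_item).text
print(text)
# First <del>section</del> Second <em>section</em>Third <del>section</del>
```

An aggregator that shows only part of this string, or nothing at all, is most likely reading the first text chunk rather than the whole element content.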

Posted by Phil Ringnalda at

Phil, the comments RSS feeds simply XML escaped the whole thing. The result appears to be both correct and valid. (Whew!)

Had it simply wrapped the whole thing in a <![CDATA[ ]]> section, your closing brackets would have broken it.
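
A minimal Python sketch of the difference, using the example comment as input. This is not the code behind the comment feed; the third variant, splitting each ]]> across two CDATA sections, is a common workaround added here for comparison and is not something the feed does.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

comment = (
    "<description><![CDATA[First <del>section</del>]]>"
    " Second &lt;em&gt;section&lt;/em&gt;"
    "<![CDATA[Third <del>section</del>]]></description>"
)

# 1. XML-escape the whole thing: round-trips exactly.
escaped = "<description>" + escape(comment) + "</description>"
escaped_ok = ET.fromstring(escaped).text == comment

# 2. Naive CDATA wrapping: the first "]]>" inside the comment ends the
#    section early, and the result is no longer well-formed.
naive = "<description><![CDATA[" + comment + "]]></description>"
try:
    ET.fromstring(naive)
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# 3. CDATA with every "]]>" split across two sections (workaround, an
#    assumption for comparison only).
split = comment.replace("]]>", "]]]]><![CDATA[>")
safe = "<description><![CDATA[" + split + "]]></description>"
safe_ok = ET.fromstring(safe).text == comment

print(escaped_ok, naive_ok, safe_ok)
```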

Posted by Sam Ruby at


Escaped mark-up is wrong!
For further reading:
[link]
[link]

Posted by Pumba at


Oh, so escaped markup is wrong, because it conveys a possible security breach! So, what we will do is to "encode" that breach so that it won't be harmful?!?! Have you considered that the breach will become again a breach as soon as I decode the text? (Cause you know, sooner or later I will HAVE to decode it)...

Posted by Paolo at


I've verified that the Radio's aggregator omits the descriptions in Brad Wilson's valid rss feed. My suggestion for supporting the widest range of aggregators is to add a remove_html filter for the description, and to provide the full content in...

Excerpt from phil ringnalda dot com: Stray CDATA end tags fix: Comments at
