Happy Birthday, Feed Validator
The Feed Validator has been giving advice for five years as of today. From a modest beginning of 300 test cases, there now are over two thousand.
My favorite post on this topic during these past five years is Common Feed Errors. Time to revisit.
- Missing atom:link with rel="self" (3072)
- This is a relatively new recommendation from the RSS Advisory Board. One thing I don’t remember sharing before is that typically when I do these checks, I find that fully one out of three feeds have already fixed these messages by the time I can recheck the feed myself. This message appears to be no exception. In addition, it already is fixed in WordPress HEAD. Needless to say, I expect the frequency of this message to go down quickly.
- XML Parsing error: syntax error (901)
- This is the glaring exception to the one out of three rule I mentioned above. XML Errors in Feeds still is the most systematic analysis of such errors that I am aware of to date. It would be nice if that study were to be updated.
- Email address is missing real name (822)
- Another new recommendation. Again, this should work itself out over time. Adding either your real name, or a recognizable pseudonym, should increase usability with a number of feed aggregators.
- item should contain a guid element (634)
- This is not a new recommendation. From the original RSS 2.0 spec:
In all cases, it’s recommended that you provide the guid, and if possible make it a permalink.
- Undefined parent element: child (574)
- This message covers two separate symptoms: typos and people not knowing about RSS 2.0’s support for namespaces. While this issue is considerably less troublesome than non-well-formed feeds, what is a concern is that this many years after the RSS 2.0 spec was released, this problem is as prevalent as it is.
- element must be an RFC-822 date-time (500)
- This continues to be the most problematic date format ever. I’m pleased to see that extensions such as SSE have moved away from it.
- Feeds should not be served with the type/subtype media type (479)
- Misconfigured servers, often serving feeds as either text/html or text/plain, have regrettably led browser vendors and even spec writers to conclude that content sniffing is a necessity.
- Your feed appears to be encoded as “this”, but your server is reporting “that” (376)
- Another way in which servers are commonly misconfigured: the use of text/xml in ways that don’t comply with RFC 3023.
- HTTP Error (381)
- It is clear that not everybody has mastered even the most basic concepts of the internet; many still need a bit of help. Don’t laugh, undoubtedly there are areas where you aren’t an expert. Now look again at that count. That many people needed additional help when the Feed Validator said that their feed was not found, or that there is a server error. In the past week alone.
- Image title doesn’t match channel title (278)
- Another, relatively new, recommendation.
- Invalid email address (274)
- In general, this means that people are incorrectly using RSS 2.0 core elements when the Dublin Core extension is what they really want.
- Self reference doesn’t match document location (214)
- Sometimes this simply means that there are multiple URIs which can be used to fetch a feed (example: http://www.example.com/… vs http://example.com/…), but in other cases there is a real problem.
- element should not contain script tag (172)
- Most well-maintained aggregators these days strip scripts from incoming feeds, so if you include such things in your feeds with the expectation that users will see the effects, you will often be disappointed. Unfortunately, this often affects embedded YouTube videos.
- Invalid HTML (166)
- While HTML grammar rules are fairly lax (especially when compared against XML), there actually are some rules. While browsers routinely deal with common variations (at times, with minor differences), the more important consideration is that a simple unmatched quote may confuse the code that scans your markup for security risks. This can lead to users seeing widely divergent, often severely stripped, output.
- element should not contain script attribute (150)
- Same basic issue, but in this case dealing with attributes like onclick.
- UnicodeError: decoding error, invalid data (146)
- This is a common enough subclass of well-formedness errors that it merits its own message. And, yes, that means that this count really should be added to the SAX Error count above. Most commonly this error occurs when people write code that essentially does a bit-for-bit copy of data from a webpage (which defaults to iso-8859-1 encoding), to an XML feed (which defaults to utf-8).
- Invalid URI character (93)
- Most commonly, a space character.
- Undefined named entity (86)
- This is yet another common enough well-formedness error to merit its own message. &nbsp; and &mdash; are not predefined in XML.
- The XML encoding does not appear to match the characters used (83)
- This is a variation on Unicode Errors. In this case what you have is an incorrect encoding, but one that technically is legal, such as taking data that is either utf-8 or win-1252 encoded and declaring it as iso-8859-1. In many cases, what you will see in a feed is incorrect numeric character references, like &#146; when what is desired is a right single quote or &#8217;.
- Incorrect day of week (83)
- All I can say is that the sheer frequency of this error flabbergasts me. People even have been known to protest when they get this message. Again, don’t laugh, one day it could be you.
- Email address is not in recommended format (81)
- Another new recommendation, but one that affects relatively few feeds.
- Missing recommended iTunes parent element: child (75)
- iTunes is optional, but if you add itunes elements you might as well follow the recommendations.
- element should not contain HTML (75)
- People still try to put escaped HTML in some of the darndest locations. But I am pleased to report that this is down slightly from before.
- Image link doesn’t match channel link (66)
- Another long-standing recommendation -- this one is down significantly from prior times.
- element must be a full URI (65)
- Also down significantly.
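
For the RFC-822 date-time and incorrect day of week messages above, one way to avoid both is to let a library derive the day name from the date rather than typing it by hand. The following is only a sketch using Python's standard library; the function names rfc822_date and day_of_week_matches are made up for this illustration and are not part of the Feed Validator.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Day abbreviations in the order used by datetime.weekday() (Monday is 0).
_DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def rfc822_date(moment: datetime) -> str:
    """Format a timezone-aware datetime for an RSS date element, in GMT."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # The day name is computed from the date, so it cannot disagree with it.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def day_of_week_matches(value: str) -> bool:
    """Return True if the day name in a date string agrees with its date."""
    claimed, comma, _rest = value.partition(",")
    if not comma:
        return True  # the day name is optional, so there is nothing to check
    parsed = parsedate_to_datetime(value)
    return claimed.strip() == _DAYS[parsed.weekday()]


if __name__ == "__main__":
    print(rfc822_date(datetime(2000, 1, 1, tzinfo=timezone.utc)))
    print(day_of_week_matches("Sun, 01 Jan 2000 00:00:00 GMT"))
```

The second function only illustrates the kind of check behind the day of week message; how the Feed Validator itself performs that check is not shown here.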
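
The UnicodeError and encoding mismatch messages above both come down to bytes being copied without regard to their encoding. As a rough sketch, assuming the source page really is iso-8859-1 or win-1252 (that is an assumption the caller has to verify), the bytes can be decoded and re-encoded before they go into a utf-8 feed. The helper names here are hypothetical.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as utf-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def transcode_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode bytes using the encoding they were written in, then emit utf-8."""
    return raw.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    page_bytes = "caf\u00e9".encode("iso-8859-1")  # b'caf\xe9'
    print(is_valid_utf8(page_bytes))                # a bit-for-bit copy fails
    print(is_valid_utf8(transcode_to_utf8(page_bytes)))

    # Byte 0x92 is a right single quote in win-1252, but a control
    # character (code point 146) when read as iso-8859-1.
    quote_byte = b"\x92"
    print(ord(quote_byte.decode("iso-8859-1")))     # 146
    print(ord(quote_byte.decode("cp1252")))         # 8217
```

The last two lines show where a reference like &#146; tends to come from: the win-1252 byte value is used as if it were a Unicode code point.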
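
For the undefined named entity message above, one possible workaround is to rewrite HTML named entities as numeric character references before the markup is placed in a feed. This is a minimal sketch using Python's html.entities table; the function name is invented for the example, and the regular expression does not understand CDATA sections or comments, so treat it as an illustration rather than a complete fix.

```python
import re
from html.entities import name2codepoint

# The five entities that XML itself predefines; these are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def html_entities_to_numeric(markup: str) -> str:
    """Replace HTML named entities with numeric character references."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Keep XML's own entities, and leave unknown names for a
            # validator to report rather than guessing at their meaning.
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _ENTITY.sub(replace, markup)


if __name__ == "__main__":
    print(html_entities_to_numeric("a&nbsp;b &mdash; c &amp; d"))
```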