It’s just data

Common Feed Errors

An analysis of a week’s worth of click-throughs on Feed Validator [help] links

MissingGuid (1726)
RSS has an identity crisis.
SAXError (653)
People are bozos.  They make all kinds of errors.
UndefinedElement (629)
Yes, it really is pubDate with a capital D.  And no, itunes:category can’t be placed at the item level.
UnexpectedContentType (536)
Why are WordPress feeds served as text/html?
EncodingMismatch (519)
XML on the Web Has Failed.  Buy the t-shirt.
InvalidRFC2822Date (461)
The single most error-prone format on the planet.  Evar.
HttpError (303)
If I can’t get to it, I can’t validate it.  Capisce?
ObsoleteNamespace (235)
What?  Atom 1.0 has been out for a full three months now, and your free hosting provider hasn’t yet upgraded?  Try FeedBurner.
ImageLinkDoesntMatch (206)
I’m still not sure why this is a problem.
UnicodeError (171)
If all you are doing is strcpy’ing your html page into your feed, do us all a favor and add the following line at the top.
<?xml version="1.0" encoding="iso-8859-1"?>

Thanks.

DuplicateDescriptionSemantics (136)
Funky!
InvalidFullLink (135)
If only xml had provided a standard way to declare the base for a given URI…
InvalidContact (120)
People don’t seem to want to reveal their email addresses.  Perhaps they should be told about Dublin Core?
NotHtml (120)
Silent Data Loss.
UnexpectedAttribute (105)
Yes, it really is spelled isPermaLink with a capital P and a capital L.
ContainsHTML (96)
Some people really want to put markup in their titles.
BadCharacters (85)
Generally this means that there are some evil quote characters smarting off again.
SecurityRisk (84)
Beware of the platypus.
MissingDescription (78)
Some people don’t seem to want to provide both a title and a description for their feed.
MissingAttribute (78)
If you are going to include an enclosure element, you might want to put the url in there too. I’m just saying…
ContainsRelRef (76)
People seem to want to put relative URI references in their descriptions too.
DuplicateValue (69)
What part of globally unique do you not understand?
MissingItunesElement (65)
If you are going to submit your podcasts to iTunes, you really should include a category, a language, and indicate whether or not the podcast is “explicit”.  Think of the children.
UndefinedNamedEntity (60)
I don’t care if XHTML defines them in their DTD; DTDs have not been a part of RSS since the summer of 2000.
NotInANamespace (58)
RSS 2.0 does not permit extensions to define child elements unless those child elements are also in a namespace.
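
Several of the entries above (MissingGuid, UndefinedElement, InvalidRFC2822Date, UnexpectedAttribute) come down to getting a handful of item-level details right. As a rough sketch only, and not anything taken from the Feed Validator itself, here is one way to build an item in Python that has a guid, the correct capitalization, and a date produced by the standard library rather than by hand. The function name build_item and the example.org URL are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET


def build_item(title, link, published):
    """Return an RSS 2.0 <item> element with a guid and an RFC 2822 pubDate.

    build_item is a hypothetical helper, not part of any feed library.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # isPermaLink: capital P, capital L.
    guid = ET.SubElement(item, "guid", isPermaLink="true")
    guid.text = link
    # pubDate: capital D.  format_datetime emits an RFC 2822 date,
    # which avoids hand-rolled date strings.
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    return item


item = build_item(
    "Hello",
    "http://example.org/2006/03/13/hello",  # hypothetical URL
    datetime(2006, 3, 13, 12, 0, 0, tzinfo=timezone.utc),
)
print(ET.tostring(item, encoding="unicode"))
```

Passing a timezone-aware datetime matters here: a naive one would be formatted with a -0000 offset, which says the zone is unknown.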

“It’s recommended that you provide the guid, and if possible make it a permalink.”

Why a permalink?

Posted by Graham at

Why a permalink?

I don’t know.  I just copied that text straight from the spec.

Posted by Sam Ruby at

Today's links [March 13, 2006]

Windows RSS Platform: Niall Kennedy also blogged about the Windows RSS platform this past weekend. Common Feed Errors: Sam Ruby posts “An analysis of a week’s work of click-throughs on Feed Validator”...

Excerpt from Blogging Roller at

Fair enough.

Posted by Graham at

Feed Breakage

Error analysis is important. When you build operating systems, you examine crashlogs. When you run search engines, you look at the searches that produced zero results. When you run a Feed Validator, you look at what kinds of mistakes people make....

Excerpt from ongoing at

For ObsoleteNamespace, FeedBurner’s only a partial solution; it will work for some user-agents but not all of them.
And for Clone Wars, the best part is that the Yahoo feed is still producing 100 duplicate GUIDs.  I don’t care who you are, that’s funny there.
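
For anyone who wants to check their own feed for the same problem, a quick sketch along these lines would do. It assumes a well-formed RSS 2.0 document with un-namespaced guid elements; duplicate_guids and the sample feed are invented for illustration and are not how the Feed Validator does it.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def duplicate_guids(rss_text):
    """Return the guid values that appear on more than one item."""
    root = ET.fromstring(rss_text)
    # Collect every non-empty guid in document order.
    guids = [g.text.strip() for g in root.iter("guid") if g.text]
    return sorted(value for value, n in Counter(guids).items() if n > 1)


# Hypothetical sample feed with one repeated guid.
SAMPLE = """<rss version="2.0"><channel>
<title>t</title>
<item><guid>urn:example:1</guid></item>
<item><guid>urn:example:1</guid></item>
<item><guid>urn:example:2</guid></item>
</channel></rss>"""

print(duplicate_guids(SAMPLE))
```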

Posted by Gordon Weakliem at

Before getting too uppity, you might want to validate the validator a bit more, Sam :) While I’m sure this feed has a ton of errors, it most certainly exists, despite what Feedvalidator claims. Discovered this today and was quite a bit irritated I couldn’t actually check how broken the feed was :)

Posted by Luis Villa at

Luis,

Something very weird is going on here:

>>> import urllib2
>>> urllib2.urlopen('http://cyber.law.harvard.edu/audio/home?func=viewRSS&wid=12')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.4/urllib2.py", line 364, in open
    response = meth(req, response)
  File "/usr/lib/python2.4/urllib2.py", line 471, in http_response
    response = self.parent.error(
  File "/usr/lib/python2.4/urllib2.py", line 402, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.4/urllib2.py", line 480, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

and

$ curl --head "http://cyber.law.harvard.edu/audio/home?func=viewRSS&wid=12"
HTTP/1.1 404 Not Found
Date: Tue, 14 Mar 2006 04:47:17 GMT
Server: Apache/1.3.34 (Unix) mod_fastcgi/2.4.2 PHP/4.3.11 mod_perl/1.29
Set-Cookie: wgSession=375j4W7jvCZrE; path=/; expires=Fri, 11-Mar-2016 04:47:17 GMT
Content-Type: text/xml; charset=ISO-8859-1
X-Cache: MISS from cyber.law.harvard.edu
Connection: close
Posted by Sam Ruby at

Feed Breakage

Error analysis is important. When you build operating systems, you examine crashlogs. When you run search engines, you look at the searches that produced zero results. When you run a Feed Validator, you look at what kinds of mistakes people make....

Excerpt from diogenius at

That Harvard feed is returning a 404 status code AND a body containing RSS:

C:\temp>irb
irb(main):001:0> require 'net/http'
=> true
irb(main):002:0> Net::HTTP.start('cyber.law.harvard.edu') do |http|
irb(main):003:1* response=http.get('/audio/home?func=viewRSS&wid=12')
irb(main):004:1> puts "Code=#{response.code}"
irb(main):005:1>         p response.body[0,100]
irb(main):006:1> end
Code=404
"<?xml version=\"1.0\"  encoding=\"ISO-8859-1\" ?>
<rss version=\"2.0\" xmlns:creativeCommons=\"http://backe"
=> nil
Posted by Jonno Downes at

links for 2006-03-14

inkling Web 2.0 prediction market game. (tags: economics web2.0 startup) Official Google Research Blog: Hiring: The Lake Wobegon Strategy “You know the Google story: small start-up of highly-skilled programmers in a garage grows into a large...

Excerpt from Edward O'Connor at

Sam Ruby’s amusing run through “a week’s work of click-throughs on Feed Validator [help] links.”...

Excerpt from del.icio.us/tag/python at

Fascinatingly bizarre. FWIW, Firefox (haven’t actually checked in IE) ignores the error code and proceeds to allow you to view the XML. But I’ll look into why we’re generating the bad 404 today as well.

Posted by Luis Villa at

IE also ignores the status and happily displays the XML

Posted by Jonno Downes at

I just deployed code that will go ahead and validate the body even on HTTPError, but only if the last non-blank line is </rss>, </feed>, or </rdf:RDF>.  But the Feed Validator will still report the error, and it will still mark the feed as invalid.
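
The check described above can be sketched roughly as follows. This is an illustration in present-day Python, not the code that was deployed: the names looks_like_feed and fetch are made up, the comparison assumes the closing tag sits alone on its line, and the body is decoded as iso-8859-1 purely so that decoding never fails.

```python
import urllib.error
import urllib.request

# Closing tags of the three feed formats named above.
FEED_END_TAGS = ("</rss>", "</feed>", "</rdf:RDF>")


def looks_like_feed(body):
    """True if the last non-blank line of body is a feed's closing tag."""
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    return bool(lines) and lines[-1] in FEED_END_TAGS


def fetch(url):
    """Return (status, body), keeping the body even on an HTTP error status."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read().decode("iso-8859-1")
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the body is still readable.
        return err.code, err.read().decode("iso-8859-1")
```

A caller would fetch the URL, report the HTTP error whenever the status is 400 or above, and go on to validate the body only when looks_like_feed returns true.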

[link]

Posted by Sam Ruby at

Thanks, Sam.

Posted by Luis Villa at

Sam Ruby: Common Feed Errors

Philippe Janvier : Sam Ruby: Common Feed Errors - “An analysis of a week’s work of click-throughs on Feed Validator [help] links” Tags : atom rss...

Excerpt from HotLinks - Level 1 at

The browsers aren’t ignoring the error code at all. Ever wonder how custom 404 pages work? Right, the server sends an HTML document which the browser then renders. It’s no different when the server sends an XML body with the 404 response: it just gets rendered. The browsers are simply doing what they always have.

Posted by Aristotle Pagaltzis at

how IE responds to different HTTP status codes

There is a discussion on intertwingly about feed errors, including the case where a server was serving a valid RSS feed with a 404 (file not found) status code. The feedvalidator was reporting the feed as being non-existent, but IE and firefox would...... [more]

Trackback from jamtronix

at

Sam Ruby has compiled a useful (and entertaining) list of the most Common Feed Errors. If you generate your own RSS feeds, it’s worth a look. I occasionally get burnt by evil smart quotes when I copy and paste content into a posting. Interactions...

Excerpt from Bob Congdon at

How IE Handles HTTP Status Codes

Jonno Downes (aka Jamtronix) has performed an experiment designed to work out how IE handles various...... [more]

Trackback from Ken Schaefer

at

Luis Villa (luis): Thu, 16 Mar 2006

Most Bizarre Technological Thing I’ve Been Involved In This Week. Still have no idea how the feed is both being served and generating a 404. On the occasion of the release of GNOME 2.14, I hope everyone in GNOME steps back and takes a moment to...

Excerpt from Planet GNOME at

Links for 2006-03-15 [del.icio.us]

SiliconBeat: The company that Fox Interactive acquired: Newroo. Rupert Murdoch buys NewRoo, a web-based aggregator. Sam Ruby: Common Feed Errors. Long list of common RSS feed formatting issues found by validator. Mobile blogging makes a move: Six Apart...

Excerpt from deeje.com/musings at

Minutiae

A surprisingly large part of a software engineer’s life is dealing with the little things. Much as I like to write about grand designs and architectural issues or people, processes and communities, all too often, the devil is in the details and I...

Excerpt from BlogAfrica at

Social Engineering

While this clearly falls far short of RFC 2119 terminology, for nearly three weeks now, the Feed Validator has issued a warning when it encounters an item in an RSS 2.0 feed that does not contain a GUID. Despite this warning being  exposed to a large... [more]

Trackback from Sam Ruby

at

Sam Ruby: Common Feed Errors

[link]...

Excerpt from Talideon.com Linklog at

Relative References

I feel strongly that Atom processors need to be able to process relative references in a consistent manner.  But, for now, I’ve restored the use of absolute URIs in my Atom feed, and I will keep it that way for a minimum of 90 days. Looking at  C... [more]

Trackback from Sam Ruby

at

Distribuire feed di qualità

Dopo Atom vs RSS e Feed autodiscovery questa è la terza puntata di una serie che potrebbe essere chiamata l’importanza di servire feed di qualità.Questa volta vorrei prendere spunto dal report pubblicato da Sam Ruby sugli errori più comuni presenti...

Excerpt from edit at

Another Month

Deja Vu. This problem is important to me because truth be told, specs matter, but only so far as they are followed.  For years, RSS had a validator that happily accepted feeds which were not even well formed XML.  We... [more]

Trackback from Sam Ruby

at

Feed mess

Something I’ve been working on at work deals with feeds - I have to read, parse and derive meaning out...... [more]

Trackback from Sriram Krishnan

at

Feed mess

Something I’ve been working on at work deals with feeds - I have to read, parse and derive meaning out of RSS and Atom feeds in the wild. And it’s not been fun. The Universal Feed Parser is nice and everything but I’m still being forced to debug...

Excerpt from Sriram Krishnan at

Bloglines Rocks!

I’ve given Bloglines a fair amount of grief over the past few months over their pathetic-at-the-time handling of Atom feeds.  I’m not ego-centric enough to believe that I got them to change – at most, I may have increased awareness of... [more]

Trackback from Sam Ruby

at

OpenSearch results validation

Given the relaunch of OpenSearch, and given that OpenSearch results can be included in feeds, it seemed time to spend some of my recreational programming time on adding Feed Validator support for the OpenSearch namespace extensions. The spec is cleanly... [more]

Trackback from Sam Ruby

at

The H stands for Hyper

Everybody seems to be linking to Pete Lacey’s The S stands for Simple.  And for good reason.  In addition to being quite funny, I can honestly say — having lived through it myself — that it is quite accurate.  In fact, if one... [more]

Trackback from Sam Ruby

at

Sam Ruby: Common Feed Errors

[link]...

Excerpt from del.icio.us/superwick at
