It’s just data

Fetch Me A Rock

We have all experienced the pleasure of trying to please either customers or bosses who don’t really know what they want.  It isn’t fun.

Dare Obasanjo recently identified the top 5 sites that, in his opinion, need to get RSS feeds.  When told that the second on his list already had a feed, his response was that “It does have the annoying characteristic of not having dates in it.”

In some corporate contexts, this process is referred to as “Fetch me a rock”.  It goes something like this: “Fetch me a rock.  No, not that one.  Fetch me another.  Now jump through this hoop.”

Just a few days before, Dare had stumbled across a related date problem.  It seems that RSS 2.0 didn’t have the date he needed, so he proposed using a Dublin Core element in a manner other than the way it was defined.  So far, Dare seems to have obtained little traction on this idea.

RSS History

Some trace RSS back to 0.90.  In those days, RSS was viewed as a vehicle for metadata only — whose sole purpose was to direct eyeballs back to one’s site.

At that time, there was one visionary who saw things differently.  His name was Dave Winer.  He saw RSS not as a vehicle for delivering metadata, but for delivering content.  He saw RSS 0.90 as woefully inadequate. It’s missing the key thing web writers and readers need. He had a competing format, named ScriptingNews, from which RSS 0.91 borrowed the description element.

Apparently, Dave was savvy enough to know that if he made description mandatory he would be flamed royally for doing so.  So in RSS 0.91, the most important element for using RSS as a channel for delivering content was the one and only optional element of item.

Dave then followed up with RSS 0.92, which leveled the playing field and made every subelement of item optional.  This was OK, as the other elements that RSS 0.91 had weren’t the vital ones that people really wanted, namely ids and dates.  Those came later.

Boring Committee Work

Fast forward to 2005.  All of the important elements remain more or less optional in RSS, which leads to “interesting” interoperability factors, such as the ones that Dare alludes to above.

After literally months of exhaustive discussion, the AtomPub IETF working group has come to consensus on requiring an id and a date on every entry (incidentally, this date is exactly the one that Dare expressed a need for).  Entries also require either a summary or a content which can be rendered as text.
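As a rough illustration of what that consensus buys a consumer, here is a minimal Python sketch that reports which of the two required children an entry lacks.  It assumes the namespace and element names of the eventual RFC 4287 (atom:id and atom:updated), which may differ from the draft under discussion; the function name and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the final RFC 4287 namespace, not necessarily the draft's.
ATOM_NS = "http://www.w3.org/2005/Atom"


def missing_required(entry):
    """Return the names of the required children this entry lacks."""
    missing = []
    for name in ("id", "updated"):
        child = entry.find(f"{{{ATOM_NS}}}{name}")
        # Treat an element with no text as missing too.
        if child is None or not (child.text or "").strip():
            missing.append(name)
    return missing


# Hypothetical feed: one complete entry, one with neither id nor date.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:example.org,2005:1</id>
    <updated>2005-05-11T12:00:00Z</updated>
    <title>Complete</title>
  </entry>
  <entry>
    <title>No id, no date</title>
  </entry>
</feed>"""

feed = ET.fromstring(SAMPLE)
report = [missing_required(e) for e in feed.findall(f"{{{ATOM_NS}}}entry")]
print(report)  # [[], ['id', 'updated']]
```

With both elements mandatory, a consumer can reject the second entry outright instead of guessing at an identity or a timestamp.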

What about people who simply want to syndicate metadata [despite there being a more than adequate format defined for such purposes]?  What about people who are simply trying to scrape together whatever bits they have in the hopes that it might be useful for somebody [despite there being a more than adequate format defined for such purposes]?

Alas, it seems to be the destiny of all standards bodies to take every fork.  If there is somebody who might reasonably be able to produce something, by all means let’s enable it, even if it creates downstream confusion.

This leads to the following controversial proposal from PaceTextShouldBeProvided:

atom:entry elements which do not contain an atom:content element, or whose atom:content element’s type attribute indicates a MIME media type, SHOULD contain an atom:summary element.

To some, this is not permissive enough.
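Read literally, the proposed rule is small enough to sketch in a few lines of Python.  This is only one interpretation: it assumes the RFC 4287 namespace, assumes that "text", "html" and "xhtml" are the only type values that are not MIME media types, and assumes a missing type attribute means "text".  The function and helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # assumption: RFC 4287 namespace
ATOM = "{" + ATOM_NS + "}"
TEXT_TYPES = {"text", "html", "xhtml"}   # assumption: the non-MIME type values


def lacks_recommended_summary(entry):
    """True when the proposal's SHOULD applies and atom:summary is absent."""
    content = entry.find(ATOM + "content")
    if content is None:
        rule_applies = True  # no atom:content at all
    else:
        # Any other type value is taken to be a MIME media type.
        rule_applies = content.get("type", "text") not in TEXT_TYPES
    return rule_applies and entry.find(ATOM + "summary") is None


def make_entry(inner):
    """Helper for the examples: wrap child markup in an atom:entry."""
    return ET.fromstring(f'<entry xmlns="{ATOM_NS}">{inner}</entry>')


text_only = make_entry('<content type="text">Hello</content>')
image_only = make_entry('<content type="image/png">iVBORw0KGgo=</content>')
summary_only = make_entry('<summary>Just metadata</summary>')
bare = make_entry('<title>Nothing else</title>')

for e in (text_only, image_only, summary_only, bare):
    print(lacks_recommended_summary(e))
```

Because the rule is a SHOULD, a checker built this way would report a warning rather than declare the feed invalid.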

Let’s hear from users

What, ideally, would people like to see in syndication feeds?

What I want to know is what we can do to get as close as possible to the point where getting a valid Atom feed means that you already know you are going to have all the information you want, even if it means that the producers have to do a little more work.

Once we know what it is, we can decide what goes into the spec itself, and what can go into a BCP document.


In order of importance:

- link to the item that is referred to
- summary

- date of last modification
- date of creation

- subject
- full text
- author

- changes that were made, if it’s a modification
- link to comments
- link to broader context (all by this author, all on this site, all about this topic)

Posted by Ģirts Kalniņš at

I’m confused. The charter you linked says the format has to be able to [interoperably] represent “a feed or channel of entries, with or without enclosed content”

Posted by Robert Sayre at

Just the facts, please

: "After literally months of exhaustive discussion, the AtomPub IETF working group has come to consensus on requiring an id...... [more]

Trackback from franklinmint.fm

at

Robert, because I was talking about atom:summary.

Posted by Sam Ruby at

I read that as requiring support for “people who simply want to syndicate metadata”.

Posted by Robert Sayre at

I don’t see the point of requiring summary, content, or even id. Items that don’t meet requirements of the client can be just ignored. Atom has too many MUSTs as is.

Posted by Don Park at

Are the human-readable elements required to be non-empty?

Also, people are already starting to use Atom as a general data envelope for raw XML, so in these cases, summaries and sometimes titles become somewhat meaningless.  Are empty <title/> and <summary/> elements okay here?

Posted by Arthur Davidson Ficke at

Arthur: at the moment, both are allowed.  As I said, there is a controversial proposal that the working group is evaluating which would indicate that such elements SHOULD be non-empty.  In particular, take a look at how Firefox’s live bookmark feature handles feeds with missing or empty titles.

Don and Rob: please take this in the spirit in which it is given.  Go fetch me a rock.

Posted by Sam Ruby at

On a point of information, RSS 0.9 was defined as an RDF vocabulary. One of the characteristics of this approach (apparently not exploited by Netscape’s original application or envisioned by Dave Winer) is that missing isn’t broken, see: [link]

For Atom, as it’s drawing more on the robust XML messaging kind of tradition, I’d disagree with Don and say the MUSTs are a feature. But empty (human) content does seem a reasonable option, unless “this element left intentionally blank” is mandated.

Posted by Danny at

Oops, sorry Sam - s/robust/reliable

Posted by Danny at

I would really like to see ‘links to comments’ as well.  Also include the number of comments at the time of grabbing the feed.

In addition, I think tags (categories) should be passed with feeds.  This could be tags of the entire feed and/or tags of individual posts.

I would love to see a moderation system like slashdot (feedback.)  Where readers could tag the posts they are reading.  For instance, on my blog I write lots of different types of posts, some are funny, some are interesting, some are insightful as they are new ideas.  I would like the community to be able to tag these posts as funny, interesting, ...  This way, posts could be ranked, as the funniest, most interesting, most/least insightful, .... It would also allow for a wealth of metadata to be built on top of the feeds that could be used for social networking, searching, or predictive analysis.  For instance, my RSS Reader could suggest blogs I would like based on my tags and the tags of the community (similar to Tivo.)

Posted by Vincent Lauria at

From my point of view as an aggregator author, there’s only one element I really need: ID. Everything else can be approximated and synthesized at a minimal cost.

From my point of view as an aggregator user, I want titles, dates, links, and summaries. (Proper summaries, not just excerpts.) I’ll settle for full articles.

Posted by John Doty at

Rmail alert - The RSS Blog

It’s the Content

Sam Ruby: At that time, there was one visionary who saw things differently.  His name was ...... [more]

Trackback from realgeek

at

It's the Content

Sam Ruby: At that time, there was one visionary who saw things differently.  His name was Dave Winer.  He saw RSS not as a vehicle for delivering metadata, but for delivering content....

Excerpt from The RSS Blog at

Rmail alert - The RSS Blog

It’s the Content

Sam Ruby: At that time, there was one visionary who saw things differently.  His name was Dave Winer.  He saw RSS not as a vehicle for delivering metadata, but for delivering content. Link...

Excerpt from Geek Space at

You must mean a different rock than the one I brought. Oy. Maybe I’ll make some Patton’s famous rock soup while I am at it.

Posted by Don Park at

What's in your feed?

Sam Ruby asks for the things readers want to see in a feed. My list: Source site URL of item Last Modified Title Author, if source site has multiple authors Full Markup of Entry...

Excerpt from More Like This WebLog at

Sam Ruby: Fetch Me A Rock

[link]...

Excerpt from del.icio.us/tag/syndication at

Sorry about the double ping. Can’t figure out how to turn off outbound trackbacks in MSN Spaces. Dare?

Posted by Randy Charles Morin at

Sam Ruby: Fetch Me A Rock

[link]...

Excerpt from del.icio.us/ndanger/todo at

Too many cooks spoil the feed format

Sometimes having a lot of people working on Atom is great, because we get a much wider variety of experience and insight than a smaller group would have available to it. But all too often, it feels like trying to get the US government to move--too...

Excerpt from Info Bite at

BOGU Killed the Elephant

In a recent post, Robert Scoble stepped out of his Microsoft BOGU shadow and landed a knockout punch on Microsoft and Dare Obasanjo. The Scobleizer pointed to a list of live.com’s most recently updated Spaces to note that most of them have no public...

Excerpt from The RSS Blog at
