We have all experienced the pleasure of trying to please either
customers or bosses who don’t really know what they
want. It isn’t fun.
Dare Obasanjo recently identified the top 5 sites that, in his
opinion, need to get RSS feeds. When told that the second on
his list already had a feed, his response was that “It does
have the annoying characteristic of not having dates in
it.”
In some corporate contexts, this process is referred to as
“Fetch me a rock”. It goes something like this:
“Fetch me a rock. No, not that one. Fetch me
another. Now jump through this hoop.”
Just a few days before, Dare had stumbled across a related date
problem. It seems that
RSS 2.0 didn’t have the date he needed, so he proposed using
a Dublin Core element in a manner other than the way it was
defined. So far, Dare seems to have obtained little traction
on this idea.
RSS History
Some trace RSS back to
0.90.
In those days, RSS was viewed as a vehicle for metadata only
— whose sole purpose was to direct eyeballs back to
one’s site.
At that time, there was one visionary who saw things differently.
His name was Dave Winer. He saw RSS not as a vehicle for delivering
metadata, but for delivering content.
Apparently, Dave was savvy enough to know that if he made description
mandatory, he would be flamed royally for doing so.
So in RSS 0.91, the most important element for using RSS as a
channel for delivering content was the
one and only optional element of item.
Dave then followed up with RSS 0.92, which leveled the playing
field and made
every subelement of item optional. This was OK, as the
other elements that RSS 0.91 had weren’t the vital ones that
people really wanted, namely ids and dates. Those came
later.
Boring Committee Work
Fast forward to 2005. All of the important elements remain
more or less optional in RSS, which leads to
“interesting” interoperability factors, such as
the ones that Dare alludes to above.
After literally months of exhaustive discussion, the
AtomPub
IETF working group has come to consensus on requiring an id and
a date on every entry (incidentally, this date is exactly the one
that Dare expressed a need for). Entries also require either
a summary or a content which can be rendered as text.
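To make that consensus concrete, here is a minimal sketch of what checking it might look like. It assumes the final Atom namespace URI and assumes the required date is the element named updated; neither is stated above, and the sample feed and the function name missing_required are purely illustrative.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace of the finished format; drafts may have used another URI.
ATOM = "{http://www.w3.org/2005/Atom}"


def missing_required(entry):
    """Return the names of the required pieces that this entry lacks."""
    missing = []
    # Assumption: "updated" is the name of the required date element.
    for name in ("id", "updated"):
        element = entry.find(ATOM + name)
        if element is None or not (element.text or "").strip():
            missing.append(name)
    # Either a summary or a content element satisfies the last requirement.
    if entry.find(ATOM + "summary") is None and entry.find(ATOM + "content") is None:
        missing.append("summary or content")
    return missing


# Hypothetical feed: one complete entry, one with nothing but a title.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:example.org,2005:1</id>
    <updated>2005-05-11T12:00:00Z</updated>
    <summary>A complete entry.</summary>
  </entry>
  <entry>
    <title>Only a title</title>
  </entry>
</feed>"""

feed = ET.fromstring(SAMPLE)
report = [missing_required(entry) for entry in feed.findall(ATOM + "entry")]
print(report)
```

A consumer could run something like this once per feed and know up front which entries it can rely on, instead of probing for each element as it goes.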
What about people who simply want to syndicate metadata [despite
there being a more than adequate format defined for such
purposes]? What about people who are simply trying to scrape
together whatever bits they have in the hopes that it might be
useful for somebody [despite there being a more than adequate
format defined for such purposes]?
Alas, it seems to be the destiny of all standards bodies to
take every fork. If there is somebody who might
reasonably be able to produce something, by all means let’s enable
it, even if it creates downstream confusion.
atom:entry elements which do not contain an atom:content
element, or whose atom:content element’s type attribute
indicates a MIME media type, SHOULD contain an atom:summary
element.
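Read literally, that rule can be sketched in a few lines. This is only one interpretation: it assumes the final Atom namespace URI, and it assumes that any type attribute other than the short names text, html, and xhtml counts as a MIME media type. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace of the finished format.
ATOM = "{http://www.w3.org/2005/Atom}"

# Assumption: these short names are the only non-MIME values of the type attribute.
INLINE_TYPES = ("text", "html", "xhtml")


def should_have_summary(entry):
    """True when the quoted rule says the entry SHOULD carry a summary."""
    content = entry.find(ATOM + "content")
    if content is None:
        return True
    # A missing type attribute is treated as plain text here (assumption).
    return content.get("type", "text") not in INLINE_TYPES


def lacks_recommended_summary(entry):
    """True when a summary is recommended but absent."""
    return should_have_summary(entry) and entry.find(ATOM + "summary") is None


def parse_entry(inner):
    """Hypothetical helper: wrap a fragment in a namespaced entry element."""
    return ET.fromstring(
        '<entry xmlns="http://www.w3.org/2005/Atom">%s</entry>' % inner
    )


results = [
    lacks_recommended_summary(parse_entry("")),
    lacks_recommended_summary(
        parse_entry('<content type="image/png" src="http://example.org/a.png"/>')
    ),
    lacks_recommended_summary(
        parse_entry('<content type="html">&lt;p&gt;Hi&lt;/p&gt;</content>')
    ),
    lacks_recommended_summary(parse_entry("<summary>Short</summary>")),
]
print(results)
```

Because the rule is a SHOULD and not a MUST, a check like this can only warn; a feed that fails it is still valid.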
What, ideally, would people like to see in syndication
feeds?
What I am seeking to know is what we can do to get
as close as we can to the point where getting a valid Atom feed means that
you already know that you are going to have all the information you want.
Even if it means that the producers have to do a little more
work.
Once we know what it is, we can decide what goes into the spec
itself, and what can go into a
BCP
document.
In order of importance:
- link to the item that is referred to
- summary
- date of last modification
- date of creation
- subject
- full text
- author
- changes that were made, if it’s a modification
- link to comments
- link to broader context (all by this author, all on this site, all about this topic)
I’m confused. The charter you linked says the format has to be able to [interoperably] represent “a feed or channel of entries, with or without enclosed content”.
I don’t see the point of requiring summary, content, or even id. Items that don’t meet requirements of the client can be just ignored. Atom has too many MUSTs as is.
Are the human-readable elements required to be non-empty?
Also, people are already starting to use Atom as a general data envelope for raw XML, so in these cases, summaries and sometimes titles become somewhat meaningless. Are empty <title/> and <summary/> elements okay here?
Arthur: at the moment, both are allowed. As I said, there is a controversial proposal that the working group is evaluating which would indicate that such elements SHOULD be non-empty. In particular, take a look at how Firefox’s live bookmark feature handles feeds with missing or empty titles.
Don and Rob: please take this in the spirit in which it is given. Go fetch me a rock.
On a point of information, RSS 0.9 was defined as an RDF vocabulary. One of the characteristics of this approach (apparently not exploited by Netscape’s original application or envisioned by Dave Winer) is that missing isn’t broken, see: [link]
For Atom, as it’s drawing more on the robust XML messaging kind of tradition, I’d disagree with Don and say the MUSTs are a feature. But empty (human) content does seem a reasonable option, unless “this element left intentionally blank” is mandated.
I would really like to see ‘links to comments’ as well. Also include the number of comments at the time of grabbing the feed.
In addition, I think tags (categories) should be passed with feeds. This could be tags of the entire feed and/or tags of individual posts.
I would love to see a moderation system like Slashdot’s (feedback), where readers could tag the posts they are reading. For instance, on my blog I write lots of different types of posts: some are funny, some are interesting, some are insightful as they are new ideas. I would like the community to be able to tag these posts as funny, interesting, ... This way, posts could be ranked as the funniest, most interesting, most/least insightful, .... It would also allow for a wealth of metadata to be built on top of the feeds that could be used for social networking, searching, or predictive analysis. For instance, my RSS reader could suggest blogs I would like based on my tags and the tags of the community (similar to Tivo.)
From my point of view as an aggregator author, there’s only one element I really need: ID. Everything else can be approximated and synthesized at a minimal cost.
From my point of view as an aggregator user, I want titles, dates, links, and summaries. (Proper summaries, not just excerpts.) I’ll settle for full articles.
It’s the Content
Sam Ruby (http://www.intertwingly.net/blog/2005/05/11/Fetch-Me-A-Rock): At that time, there was one visionary who saw things differently. His name was Dave Winer. He saw RSS not as a vehicle for delivering metadata, but for delivering content....
Sam Ruby asks for the things readers want to see in a feed. My list: Source site URL of item Last Modified Title Author, if source site has multiple authors Full Markup of Entry...
Sometimes having a lot of people working on Atom is great, because we get a much wider variety of experience and insight than a smaller group would have available to it. But all too often, it feels like trying to get the US government to move--too...
In a recent post, Robert Scoble stepped out of his Microsoft BOGU shadow and landed a knockout punch on Microsoft and Dare Obasanjo. The Scobleizer pointed to a list of live.com’s most recently updated Spaces to note that most of them have no public...