It’s just data

RSS 2.funky

Randy Charles Martin: Funky elements are awarded funky points depending on how funky they are. A really funky element gets as many as 5 funky points, where a somewhat funky element gets only 1 funky point. Summing your funky element scores and you get your total funky score or TFS.

I'm still not sure what funky means in this context.  The best collection of sound bites I've seen on the subject is the one gathered by Philippe Janvier.

Randy gives top funky score to dc:date.  Based on my reading of the subject, I would disagree.  Given the clues above, it would seem to me that it is pubDate that is funky.  After all, it is exactly the version of the RSS 0.9x family of specifications which introduced the concept of namespaces that chose not to respect prior art and to introduce pubDate.  One that is more complicated and harder to get right.  One that is harder to sort in reverse chronological order.  One that is less friendly to our brethren in other countries.


Boy, and I thought *I* had too much free time. :-)

The point (and the previous point that Tim made) is well-taken, though.  This entire issue is ridiculous, it's FUD, and it's a non-issue in the field (every news aggregator that cares about dates understands dc:date).

Posted by Mark at

Mark,

I haven't had a real job in 5 years. I'm paid to surf the Internet, blog and pick my nose.
http://www.kbcafe.com/iBLOGthere4iM/default.aspx?date=20030612#12144024

P.S. My apologies for giving dc:date top marks.

Posted by Randy S Frustrated at

As for the prior art issue, I've been doing some digging, and I am inclined to reverse my previous position (when I stated that pubDate did not respect prior art).

Dublin Core has been around for years, and it's a worldwide ISO standard now.  Many people chastise Dave Winer for not supporting it.  BUT...  RSS 2.0 simply builds on the precedent set by the RSS 0.9x line.  When Dave introduced item-level pubDate in RSS 2.0, he was respecting prior art: the prior art of the original RSS 0.9x line.  Consider:

Netscape's RSS 0.90 (March 1999) had no dates of any kind, but Netscape added a channel-level pubDate in RSS 0.91 (July 1999).  (They also added lastBuildDate.)  These date elements were in RFC-822 format, a format which has persisted (with the addition of 4-digit years) to the present day.

BUT... in 1999, why did Netscape add those elements, with that particular date format?  Netscape took them straight out of Dave's competing syndication format, called [scriptingNews].  (BTW, this is the basis of Dave's claim that he "co-authored" RSS.  Please, let's not go there today, OK?)

Now, [scriptingNews] format predates RSS by a wide margin.  Dave started developing it back in 1997, and it had pubDate in that date format.  But he didn't make up that date format.  He used the one that's always been used in email headers, as defined in RFC 822, which dates back to 1982.  (For Year 2000 compliance, 4-digit years were added in RFC 2822, in April 2001.  I am not making this up.)

Now, the Dublin Core specification itself was stable by 1997 (here's an article from July 1997 on how to use it in HTML: http://www.ariadne.ac.uk/issue10/dublin/ ), BUT (a) it was very new, and (b) all the work being done with it focused on HTML.  I find no record of it being used in XML until 2000, and even then only in the context of RDF/XML.

BUT... the ISO 8601 date format (which is what Dublin Core uses, and is the format that Sam refers to as "easier to get right") did exist in 1997; it dates back to 1988.  In fact, Microsoft's early versions of CDF (submitted to the W3C in March 1997) specified a LastMod element that is a date in ISO 8601 format.

(BUT... the ISO 8601 date format that we know and use today (and that is used in Dublin Core, RSS 1.0, and some RSS 2.0 feeds), didn't exist in 1997; it's really ISO 8601:2000, meaning that there was a new revision of the ISO 8601 standard which was formalized in the year 2000.  The revision was made because ISO 8601:1988 defined an overly wide variety of date formats that, you guessed it, made it very difficult to parse.)

To recap: in 1982, RFC 822 defined a date format.  In 1997, Dave Winer respected prior art by using that date format.  In 1999, Netscape respected prior art by taking elements from Dave's format and not changing the date format.  In 2000, Dave Winer continued the RSS 0.9x line and respected his own and Netscape's prior art by not changing the date format.  In 2002, Dave Winer respected this entire line of prior art by adding item-level pubDate, with the same date format.

Now, none of this is to suggest that namespaces are bad.  That's still ridiculous.  Using Dublin Core is also respecting prior art, just a different lineage of prior art.  Using it in RSS is absolutely legitimate, and every news aggregator I know of (that cares about dates) supports it.

Furthermore, Dublin Core and ISO 8601 have "won" in the marketplace; virtually no one uses the RFC (2)822 format, except email (which has always used it) and RSS 0.9x/2.0.  If I were creating a brand new format today, any kind of format, for any reason, I would absolutely use the ISO 8601 date format.  If namespaces were in the picture, I would absolutely use Dublin Core, straight up.  It's here, it works, it's its own ISO standard now.

References:
- http://www.purplepages.ie/RSS/netscape/rss0.90.html
- http://my.netscape.com/publish/formats/rss-spec-0.91.html
- http://www.faqs.org/rfcs/rfc822.html
- http://www.faqs.org/rfcs/rfc2822.html
- http://my.userland.com/stories/storyReader$11
- http://www.w3.org/TR/NOTE-CDFsubmit.html
- http://www.ariadne.ac.uk/issue10/dublin/
- http://www.cs.tut.fi/~jkorpela/iso8601.html
- http://hydracen.com/dx/iso8601.htm

Posted by Mark at

Nicely done, Mark. That was what I wanted to say over in the Phil-hosted discussion, but you said it so much better than I would have.

So, um, +1?

Posted by Bryant at

I certainly don't want to join the debate about funkiness; however, with regard to RFC 822 and ISO-8601, may I refer you to the HTTP 1.1 specification...

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3

"The first format is preferred as an Internet standard and represents a fixed-length subset of that defined by RFC 1123 [8] (an update to RFC 822 [9]). "

HTTP, IMAP, and SMTP (and RSS :) all use RFC 2822.  These are arguably the most used protocols on the internet, so I wouldn't say that ISO-8601 has "won" the marketplace.

What am I missing?

Posted by Sterling Hughes at

I second the comment.  Nicely done, Mark.  +1.

Posted by Sam Ruby at

Just as an additional data point, I'll note that some app servers (Coldfusion MX in my case) parse RFC 822 dates without a hitch, but cannot natively parse 8601.

Posted by Roger Benningfield at

I don't think standards defined after 1998 or so are likely to use RFC 1123. On the other hand, good point re: HTTP.

I continue to agitate quietly for just putting both pubDate and dc:date in there.

Posted by Bryant at

Agreed... aside from a few extra bytes of bandwidth, I don't see any reason not to just stick both in there, and let the client parse whichever is easiest for it.
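As a rough illustration of what "stick both in there" could look like on the producing side, here is a minimal Python sketch that renders one timestamp in both formats. The function name `item_dates` and the sample timestamp are made up for the example, and the output is a bare text fragment rather than a namespace-aware XML serialization.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def item_dates(published: datetime) -> str:
    """Render one timezone-aware timestamp as both pubDate and dc:date.

    `item_dates` is a hypothetical helper for this sketch, not part of
    any feed library.
    """
    pub_date = format_datetime(published)  # RFC 822 style with a 4-digit year
    dc_date = published.isoformat()        # ISO 8601 profile, e.g. 2003-06-12T09:30:00-04:00
    return f"<pubDate>{pub_date}</pubDate>\n<dc:date>{dc_date}</dc:date>"


# Sample timestamp (an assumption for the example): 12 June 2003, 09:30 at UTC-4.
published = datetime(2003, 6, 12, 9, 30, 0, tzinfo=timezone(timedelta(hours=-4)))
fragment = item_dates(published)
print(fragment)
```

Both elements come from the same `datetime`, so they cannot disagree with each other; a client then reads whichever one it finds easier to parse.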

Posted by Roger Benningfield at

For something as small as pubDate, I'm personally not worried about the bandwidth implications.  I'm a bit more concerned about the precedence rules between these elements, but if somebody were to name an actual tool that could benefit, and if this could put a clear and decisive end to the vague 'funky' campaign, then I would be in.

Note: blog entries tend to have a single date, so I am a bit less concerned here.  I still have an issue with duplication of elements like dc:subject.

Posted by Sam Ruby at

Ah, well. I don't think it'd end the "funky" campaign.

My original take on the idea of coexistence was that it would have been a more effective way to get pubDate into the MT default feeds. The whole "funky" campaign did not seem to be very effective.

I mean, my actual reaction was to remove my old RSS .91 feed, leaving me with just an RSS 1.0 feed. If you asked me how to fix the schism, I'd recommend ignoring people who, in your estimation, seem to be acting irrationally.

Posted by Bryant at

History of RSS date formats

I want to talk about prior art.  But I can't do that yet, because first I need to give my opinion about this "funky RSS" business.... [more]

Trackback from dive into mark at

How do you sort 1999-12-31T23:00:00-05:00 and 2000-01-01T00:00:00+01:00 in reverse chronological order and in what way is that less hard than sorting the equivalent RFC-822 dates?

I agree that the ISO date looks less friendly to me as a native dutch speaker ;-)

Posted by Curioso at

Easy, but not so easy for me to describe. The parts of a dc:date string are in the best order for simple sorting. It just works, straight off, but the fact that RFC-822 months are letters instead of numbers means that extra processing has to be done to make sure, for example, December doesn't end up before May.
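A small Python sketch of that claim, assuming every timestamp carries the same UTC offset (the sample dates are invented for the example): a plain string sort orders the ISO 8601 values correctly, while the RFC 822 strings get ordered by their leading characters instead.

```python
# Invented sample dates; all three share the same +00:00 offset.
iso_dates = [
    "2003-05-20T08:00:00+00:00",
    "2003-12-01T08:00:00+00:00",
    "2003-06-12T08:00:00+00:00",
]
rfc822_dates = [
    "Tue, 20 May 2003 08:00:00 +0000",
    "Mon, 01 Dec 2003 08:00:00 +0000",
    "Thu, 12 Jun 2003 08:00:00 +0000",
]

# Default string sort, newest first.
iso_newest_first = sorted(iso_dates, reverse=True)
rfc822_newest_first = sorted(rfc822_dates, reverse=True)

print(iso_newest_first)     # December, June, May: chronologically correct
print(rfc822_newest_first)  # ordered by weekday name, so May lands first
```

Dropping the optional weekday does not rescue the RFC 822 strings: they would then sort by day of month, and after that by month name.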

Posted by Marcus at

How do you sort dates in the various formats?  You parse them into dates, and then sort by date.  Why is that hard?

Message from James Robertson at

It's not... but it's easier if you don't have to. Especially if they're held in an array and all you have to do is run the default array sort.

Posted by Marcus at

Marcus, but then you are doing it wrong; my 1999 date is later than my 2000 date, because of the timezone differences. So it seems that the ISO date is easier to sort wrong; with the RFC-822 date you are at least more aware of the problem of sorting.

I agree with James: you have to parse and convert to a canonical form, and then sort.
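The two dates from the question above make a handy test case. This is a minimal Python sketch using only standard-library parsers; the RFC 822 renderings of the same two instants are written out by hand for the example.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

iso_a = "1999-12-31T23:00:00-05:00"  # 04:00 UTC on 1 January 2000
iso_b = "2000-01-01T00:00:00+01:00"  # 23:00 UTC on 31 December 1999

# Sorting the raw strings puts the "2000" string first, which is wrong.
string_order = sorted([iso_a, iso_b], reverse=True)

# Parsing first compares actual instants, so iso_a correctly comes first.
parsed_order = sorted([iso_a, iso_b], key=datetime.fromisoformat, reverse=True)

# The same two instants in RFC 822 form (hand-written equivalents).
rfc_a = "Fri, 31 Dec 1999 23:00:00 -0500"
rfc_b = "Sat, 01 Jan 2000 00:00:00 +0100"
rfc_order = sorted([rfc_a, rfc_b], key=parsedate_to_datetime, reverse=True)

print(string_order)
print(parsed_order)
print(rfc_order)
```

Once both formats are parsed into timezone-aware values, the sort is the same one line either way.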

Posted by Curioso at

Egads! Of course, you're absolutely right. Unless I'm missing something further, it seems you have extra processing to do either way.

Posted by Marcus at

To add something to Mark's comments about ISO 8601, which is not freely accessible: the W3C has published a W3C Note which defines the format used in our specifications:

http://www.w3.org/TR/NOTE-datetime

Posted by karl at

In a sense you are missing something. You always have to do extra processing with RFC-822, while 8601 only requires extra parsing IF you are working with or storing timestamps with different timezones. If you are working with timestamps with the same timezone then no additional processing is necessary.

Personally, when I am working with different timezones I always normalize my timestamps to the same one before storing them. I do it once and don't think about it. With 8601 that normalization is pretty straightforward math with no lookup tables involved.
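A minimal Python sketch of that normalize-once approach, assuming UTC as the common zone. It leans on the standard library rather than hand-rolled offset arithmetic, and `to_utc_string` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_utc_string(stamp: str) -> str:
    """Re-express an ISO 8601 timestamp with an offset as the same instant in UTC.

    `to_utc_string` is a hypothetical helper for this sketch.
    """
    return datetime.fromisoformat(stamp).astimezone(timezone.utc).isoformat()


stamps = ["1999-12-31T23:00:00-05:00", "2000-01-01T00:00:00+01:00"]

# Normalize once, at storage time.
normalized = [to_utc_string(s) for s in stamps]

# After that, the default string sort is also a chronological sort.
newest_first = sorted(normalized, reverse=True)
print(newest_first)
```

The normalization cost is paid once per timestamp; every later sort can treat the stored values as plain strings.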

Posted by Timothy Appnel at

Such fun.

This is why I use RSS 1.0 and FOAF. Meant to be funky beyond belief.

Posted by Aredridel at

Sam Ruby: RSS 2.funky

[link]...

Excerpt from del.icio.us/tag/RSS at
