Really Simple Syndication
By Sam Ruby, September 2, 2002.
Preface
From Netscape's RDF
Site Summary (RSS) 0.9 official DTD, proposed:
RSS is an XML/RDF vocabulary for describing metadata about websites, and
enabling the display of "channels" on the "My Netscape"
website.
This document explores the basic concepts behind the various XML grammars
which were derived from this base and makes suggestions as to directions for
their further evolution.
Gentle introduction to RSS 
Let's start this journey by looking at a humble RSS file. In fact, let's
look at mine as produced by two different tools, first Radio,
then blosxom.
There are some differences (primarily additional elements that few render), but
the essential structure is the same, i.e.:
Channels have items. Both channels and items have (optionally, it
turns out) a title, link, and description.
That's pretty much all you need to get up and running. Everything else is
gravy. You could write one by hand, but the truth of the matter is that
the overwhelming majority of RSS feeds are written by programs and for programs.
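To make that structure concrete, here is a minimal sketch in Python. The feed below is a hand-written, hypothetical example (not either of the feeds mentioned above), and the function name summarize is made up for illustration. It assumes the 0.9x layout, where items sit inside the channel.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: one channel containing two items.
FEED = """<rss version="0.92">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <description>An illustrative channel</description>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Hello, world</description>
    </item>
    <item>
      <description>An item with only a description</description>
    </item>
  </channel>
</rss>"""


def summarize(xml_text):
    """Return the channel title and a (title, link, description) tuple per item."""
    channel = ET.fromstring(xml_text).find("channel")
    items = [
        # findtext returns None when the element is absent, which suits
        # the "optionally, it turns out" nature of these elements.
        (item.findtext("title"), item.findtext("link"), item.findtext("description"))
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items


channel_title, items = summarize(FEED)
```

The second item deliberately omits its title and link, so the corresponding tuple entries come back as None rather than raising an error.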
Specifications 
Now let's take a look at a specification. Here's the recently proposed 0.94
spec.
It describes a lot more elements than I described above. Here's a summary
of the ever-growing number of elements defined in the core specifications.
RSS 1.0
bucks the trend. Instead of an ever-growing number of elements in the core
specifications, extensions are provided by modules.
In fact, there is even a proposed
module capturing the additions made by 0.91. The roadmap
for 0.94 indicates that successors to this version may consider a similar
approach, but for now, additions continue to be made to the core.
Optionality 
The change notes for
0.94 indicate that <image> is now optional. This surprised
me. Looking at my RSS feeds, including the one produced by Radio, I don't
see an image element. I am not aware of any aggregator having a problem
with my Radio newsfeed, so I can only conclude that this element was effectively
always optional.
Continuous growth 
Growth has been slow but continuous. Looking again at the change
history for 0.94, it is clear that great pains have been taken to ensure that upward
compatibility has been maintained. What this means is that if you
author (by hand or by program) an RSS feed precisely according to the 0.92
specification, your investment will be protected: the feed can be consumed by any
aggregator designed for 0.94.
As I said previously, RSS feeds are typically written by programs and for
programs. There are a number
of programs designed to consume RSS feeds. These programs are known as
aggregators. Now, let's take a moment and
look at compatibility from the consumer's perspective.
RSS 0.91, RSS
0.92, and RSS 0.94 do
not make any claims about backwards
compatibility. What this means is that if you are writing an aggregator
(or merely using one),
there are no guarantees that your investment will be protected.
However, things are not as dire as this may seem. Given the excellent
record on upward compatibility, it appears that one can safely assume that the
following changes can be expected: required elements may be made optional, limits
will be lifted, and new elements will be added. The good news is that if there is an element that you are looking for, it's
meaning won't change. Given these observations, it is possible to cope
with change after a fashion. Namespaces would be better. And
they may be coming, just not now. At least not in 0.94.
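One way an aggregator might cope is sketched below in Python. This is an illustration under stated assumptions, not how any particular aggregator works: the names KNOWN and read_item are made up, and futureElement stands in for whatever a later version of the spec might add.

```python
import xml.etree.ElementTree as ET

# The elements this hypothetical consumer understands.
KNOWN = ("title", "link", "description")


def read_item(item):
    """Keep the elements we understand; tolerate missing or unfamiliar ones."""
    found = {}
    for child in item:
        # Unknown elements are skipped rather than treated as errors, and
        # nothing is required, so a formerly required element going
        # optional does not break us.
        if child.tag in KNOWN and child.text:
            found[child.tag] = child.text.strip()
    return found


# An item from some future version: no link, no description, one new element.
item = ET.fromstring(
    "<item><title>Hi</title><futureElement>42</futureElement></item>"
)
result = read_item(item)
```

Because the consumer only looks for elements whose meaning it already knows, additions to the core pass through harmlessly.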
Version attribute 
Now let's look at the version attribute, present at the top of, for example,
RSS 0.92
feeds. Who is this data intended for? I have no research to back this up, but it would
seem to me that most consumers of RSS feeds would ignore this attribute, for two
reasons, both stated previously. First, one can't assume that the data
that follows is valid with respect to such specifications. Second, there
will in all likelihood be other versions of RSS specifications, perhaps not even
written yet, that have to be dealt with by the same aggregator.
Structural differences 
Now let's look at other differences between RSS 0.9/RSS 1.0 and RSS 0.91/RSS
0.92/RSS 0.94. Comparing the latest of each branch one ends up with the
following:
- The names of the outermost elements are both TLAs
starting with the letter 'R'. Just different TLAs.
- Both support
the essential <channel>, <item>, <title>, <link>,
and <description> elements described
above.
- Both also support <image> and <textInput> elements, which appear
largely to be holdovers from 0.9.
- <item> elements appear within
<channel>
elements in 0.94, and appear alongside the <channel> in 1.0.
- 1.0 supports
namespaces now; successors to 0.94 may do so in the future.
- 0.94 defines
more elements in the core specification. When you include modules, 1.0
has more elements defined in total.
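A consumer that wants to handle both branches has to bridge the placement and namespace differences listed above. The Python sketch below shows one possible approach, which is to match on the local element name anywhere in the tree. The helper names local_name and find_items and both sample documents are hypothetical, and the 1.0 sample is trimmed down to what the example needs.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def find_items(xml_text):
    """Return every <item> element, wherever it sits in the tree."""
    root = ET.fromstring(xml_text)
    return [el for el in root.iter() if local_name(el.tag) == "item"]


# 0.9x style: items nested inside the channel, no namespaces.
NESTED = "<rss version='0.92'><channel><item><title>a</title></item></channel></rss>"

# 1.0 style: items alongside the channel, everything namespaced.
SIBLING = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/"><title>c</title></channel>
  <item rdf:about="http://example.com/1"><title>b</title></item>
</rdf:RDF>"""

nested_items = find_items(NESTED)
sibling_items = find_items(SIBLING)
```

Ignoring the namespace in this way is a shortcut that throws away exactly the information namespaces exist to carry, so it is best seen as a pragmatic workaround rather than a recommendation.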
King for a day 
So let's start with a clean slate and describe what I would like to see in an
RSS 2.0 if I were made king of the world for a day and were free to make
whatever changes I like. Of course, if I were made king of the world for a
day, I would probably devote my time to other matters, but let's not digress
here too much...
Before talking about futures, it helps to establish a framework of
values.
From the very beginning of my career, I've been indoctrinated into the
importance of backwards compatibility. Not just for
producers, but also for consumers. As king, I would ensure that the next
spec explicitly recognizes the importance of this from this point on.
Simplicity. I really L*O*V*E the new name for RSS 0.94.
Really Simple Syndication. Unfortunately, this spec attempts to live up to this
new name by adding still more attributes to the core, albeit optional ones.
Extensibility. As described in the RSS 1.0 design
goals, and affirmed by the RSS 0.94 roadmap,
developers should be able to add modules without interfering with each other's
work. So this one no longer appears to be controversial.
The plan, day one 
For starters, I would like to see a return to simplicity. Remove from
the core all elements except <channel>, <item>, <title>,
<link> and <description>. And make every one of them optional
except channel. This means that image and textInput would be placed into a
"mod_rss09" module.
Then add in an 0.91 module,
with a key difference. I'd like to see UserLand host the document and have
the RSS 2.0 modules list reflect this. This means that every recipient of a
document containing these elements would provide attribution to UserLand, as
well as having the URL where they can find the human-readable description for
any such elements. This should be repeated for 0.92 and 0.94. Simon
Fell can host his own description for 0.93.
Of course, every 1.0
module would be a valid 2.0 module.
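As a rough illustration of how a slim core plus modules might look to a consumer, the Python sketch below separates un-namespaced core elements from namespaced module elements. Everything about the feed is an assumption made for the example: the outer element name rss and the item placement are arbitrary choices (both are day-two decisions), and the Dublin Core namespace is borrowed from the existing RSS 1.0 module of that name. The function split_children is likewise made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed in the proposed style: a tiny core, with anything else
# supplied by a namespaced module (here, Dublin Core).
FEED = """<rss xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example</title>
    <item>
      <title>Post</title>
      <dc:creator>Sam</dc:creator>
    </item>
  </channel>
</rss>"""


def split_children(element):
    """Separate core (un-namespaced) children from module (namespaced) ones."""
    core, modules = {}, {}
    for child in element:
        if child.tag.startswith("{"):
            # ElementTree spells namespaced tags as '{uri}name'.
            namespace, name = child.tag[1:].split("}", 1)
            modules[(namespace, name)] = child.text
        else:
            core[child.tag] = child.text
    return core, modules


item = ET.fromstring(FEED).find("channel/item")
core, modules = split_children(item)
```

The namespace URI travels with each module element, which is what would let a recipient find both the attribution and the human-readable description for it.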
As to whether the items should be in or out of the channel, or the name of
the outermost element which acts as a container... I would leave such important
decisions to day two.
Conclusions 
I actually don't want to slow down or derail the current 0.94 work. Let
a thousand flowers bloom and all that. But it is helpful occasionally to
revisit first principles. In this case, can every feature of 0.92 justify
itself? If so, great; otherwise, perhaps at some point in the future it
might be worth streamlining the core spec.
Meanwhile, RSS has grown considerably from its original humble beginnings as
a "site summary" to a syndication format that enables people to
communicate with people without significant investment in infrastructure and
across both time and platform boundaries. Everyone involved, particularly Netscape,
UserLand, and the RSS-DEV
working group, deserves our gratitude.
© Copyright 2002 Sam Ruby.
Last update: 9/2/2002; 5:18:49 PM.