Brent Simmons: My job is to treat Atom and RSS as peers, and to
do a great job supporting both formats. I do not prefer one over
the other, and I go out of my way to stay far away from the
fighting. (One of the beautiful parts of newsreaders is the
Unsubscribe button.)
Rogers Cadenhead
Rogers Cadenhead was recently appointed to the RSS 2.0
Advisory Board. Shortly thereafter, Rogers
described a technique that will enable you to insert "unusual"
characters like "¿" into your RSS 2.0 feeds without causing
your feed to blow up. In the
ensuing discussion, it became apparent that while he succeeded
in producing a feed that was
entirely valid according to the RSS
2.0 spec, he produced one that simply would not display properly in
the Radio UserLand aggregator. Oopsie.
Rogers later made a post
that was based on the assertion that software that reads RSS 2.0
assumes — like Radio UserLand apparently does — that things like
channel titles and item titles and item descriptions are entity encoded
HTML. Again, the
resulting discussion provided some revelations on that
score.
The most amazing thing about this entire discussion is that
Rogers maintained his sense of humor throughout, and those that see
value in the Atom specification didn't take the opportunity to go
for the jugular, something I am ashamed to say I have seen happen
too often in previous syndication discussions. IMHO, this
combination of humility and restraint is the key to progress.
Kudos to everybody involved, in particular Rogers.
Reuters does RSS
OK, so some unusual characters work differently across
aggregators. What's the big deal?
But again, another rare problem that occurs in few
aggregators. Edge cases.
Now, view source on the business
feed. The descriptions are completely valid and conform
to the spec in every way. As you would expect in a business
feed from a company like Reuters, many of these descriptions contain
stock ticker symbols. Now subscribe to this feed (or perhaps
this snapshot) in your aggregator. Look for the stock ticker
symbols.
We have yet to find a single aggregator which will show the
stock ticker symbols.
Not a single one.
This is called data loss. Silent data loss. We are
not talking about unusual characters. Or the occasional item
in a few aggregators. We are talking stock ticker
symbols. In a Reuters Business feed. In every single
aggregator that we have tested with so far. Silently.
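To make the failure concrete, here is a minimal sketch in Python. It assumes the ticker symbols sit between angle brackets in the description, and that the aggregator treats the description as HTML. The item, the company name, and the ticker `XYZ.N` are made up for illustration, and the regular expression is only a crude stand-in for real HTML rendering.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical item; the company and ticker are invented for this sketch.
RSS_ITEM = """<item>
  <title>Example Corp shares rise</title>
  <description>Shares of Example Corp &lt;XYZ.N&gt; rose on Friday.</description>
</item>"""


def xml_text(item_xml):
    """Return the description exactly as an XML parser reports it."""
    return ET.fromstring(item_xml).findtext("description")


def render_as_html(text):
    """Crude stand-in for an aggregator that treats the text as HTML:
    anything that looks like a tag is dropped, much as a browser
    ignores elements it does not know."""
    return re.sub(r"<[^>]*>", "", text)


plain = xml_text(RSS_ITEM)      # the ticker is still present here
shown = render_as_html(plain)   # the ticker is gone, with no warning

print(plain)
print(shown)
```

The XML is fine and the parser hands back the full text. The loss happens one step later, when text that the publisher meant literally is interpreted as markup.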
This can be corrected. I was the first to notice the
problem. I let Mark Pilgrim know and he talked to the
responsible person at Reuters. After enumerating the options,
Reuters has elected to update their feeds by wrapping their
descriptions in CDATA sections. In rare circumstances this may cause
their feeds to become not well formed, but this is the simplest fix
that they can make that will work with the widest range of
aggregators.
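A rough sketch of that fix, again in Python and again with made-up text. It assumes the already-escaped description is wrapped unchanged in a CDATA section, and that the aggregator strips tags and then decodes HTML entities. One way a feed built like this can stop being well formed is a description that happens to contain the sequence `]]>`. That is my guess at the rare circumstance, not something Reuters stated.

```python
import html
import re
import xml.etree.ElementTree as ET


def wrap_description(escaped_text):
    """Wrap already-escaped text in a CDATA section (hypothetical helper)."""
    return "<description><![CDATA[" + escaped_text + "]]></description>"


def aggregator_view(description_xml):
    """Stand-in for an aggregator that treats the description as HTML:
    strip anything tag-like, then decode HTML entities."""
    text = ET.fromstring(description_xml).text
    without_tags = re.sub(r"<[^>]*>", "", text)
    return html.unescape(without_tags)


def is_well_formed(xml_string):
    """Report whether the XML parser accepts the string."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


ok = wrap_description("Example Corp &lt;XYZ.N&gt; rose.")
bad = wrap_description("Text containing ]]> ends the section early.")

print(aggregator_view(ok))
print(is_well_formed(ok), is_well_formed(bad))
```

Inside CDATA the `&lt;` and `&gt;` survive XML parsing as literal text, so the HTML step turns them back into visible angle brackets instead of discarding them.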
I asked Reuters' permission before sharing
this information. They agreed that it is an issue worth
discussing. Publicly. Especially given the fix is
something not addressed in the RSS specification itself.
We need to get the word out. Your titles and descriptions
can be 100% valid according to the RSS specification. And yet
not work as you intend in any aggregator.
In an ideal world, the RSS spec would be updated to specify
precisely how these cases are to be handled. However, in a
very practical sense, this would introduce a
discontinuity. Problematic feeds that previously were
technically valid would suddenly become invalid.
Even if this weren't done, the RSS 2.0
roadmap does leave the door open for clarifications. If
the spec were to be updated to merely say how various textual
elements SHOULD be interpreted, I would gladly update the
feedvalidator to provide informational messages when problematic
values for these elements are detected. The feedvalidator
will remain open source, and Dave and Andrew can choose to update
their mirror to the latest version at any time.
We can work together to spread the word, reduce surprises, and
improve the user experience with RSS 2.0. I'm confident it
can be done.
RSS Versions
If mangling unusual characters and silent data loss weren't bad
enough, there is that nearly four-year-old
fork thing to deal with. Robert Scoble had suggested that
companies with limited resources (like Microsoft?) only
support RSS 2.0 and Atom. This would rule out a large
number of popular feeds. Including Slashdot. BoingBoing. Google.
Kuro5hin. And, most importantly, Dilbert.
It is not simply a matter of multiple versions. It is
multiple versions that call themselves RSS. This inevitably
causes some bleed-through. Consider the Dilbert feed
mentioned above. It is RSS 1.0. With guids. Guids
are from RSS 2.0. You won't find a mention of guids in the
RSS 1.0 spec.
Bleed-through goes both ways. ESPN feeds
are RSS 2.0. With rdf:resource and rdf:about
attributes. From, you guessed it, RSS 1.0. You won't
find a mention of rdf in the RSS 2.0 spec.
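A parser can at least notice bleed-through when it happens. The sketch below, in Python, guesses the flavor from the root element and reports whether a feed carries guid elements or rdf: attributes. The two sample feeds are invented, not copies of the Dilbert or ESPN feeds, and treating every rdf:RDF root as RSS 1.0 is a simplification.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Invented samples that mimic the two kinds of bleed-through.
RSS1_WITH_GUID = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.com/1">
    <title>Strip</title>
    <guid>http://example.com/1</guid>
  </item>
</rdf:RDF>"""

RSS2_WITH_RDF = """<rss version="2.0"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <channel>
    <item rdf:about="http://example.com/2">
      <title>Score</title>
    </item>
  </channel>
</rss>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def sniff(feed_xml):
    """Return (flavor, has_guid, has_rdf_attrs) for a feed document."""
    root = ET.fromstring(feed_xml)
    root_name = local_name(root.tag)
    if root_name == "RDF":
        flavor = "RSS 1.0"  # simplification: other RDF-rooted versions exist
    elif root_name == "rss":
        flavor = "RSS " + root.get("version", "?")
    elif root_name == "feed":
        flavor = "Atom"
    else:
        flavor = "unknown"
    has_guid = any(local_name(el.tag) == "guid" for el in root.iter())
    has_rdf_attrs = any(
        name.startswith("{" + RDF_NS + "}")
        for el in root.iter()
        for name in el.attrib
    )
    return flavor, has_guid, has_rdf_attrs


print(sniff(RSS1_WITH_GUID))
print(sniff(RSS2_WITH_RDF))
```

Neither combination is something a parser written from only one spec would expect to see.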
There are many elements that were introduced in RSS 2.0 that
duplicate functionality that was commonly found in namespaces in
RSS 1.0.
Jon Udell added Dublin Core information to his feeds in September
of 2002. I felt strongly about this. As did Rael Dornfest. And
Mark Pilgrim.
But these new elements were added anyway. And
political FAQs were written that simply boil down to a
suggestion that you should use one set of optional elements instead
of another.
What this means to you is that if you want to support
any version of RSS completely, you essentially need to support all
of them. And there is no central directory which includes all
of them.
But don't despair. There is a Universal
Feed Parser. It handles every known version of RSS.
It even supports Atom. And CDF. It supports 40
namespaces. It is open source. And even if for some
reason you find you can't use it directly, you can still make use
of the literally thousands of test cases that come with it.
If it can be done by one person, it can be done by others. And it is
worth doing. For the user's sake.
Where do we go from here?
Good question. Essentially, there exists a plurality of
standards today. RSS 1.0 is perhaps the most formal.
RSS 2.0 is the most permissive. Permissiveness increases
adoption rates, but as we see above, at a long term cost of
ambiguity.
Atom is not done yet, but Atom's focus has been on
interoperability and fidelity. There already are some
conformance tests. There will be more. Lots more.
Atom attempts to strike a balance between formality and
simplicity. This comes at a cost of generality.
Example: the rdf:about attribute is required in RSS 1.0 on items.
The corresponding element in RSS 2.0, guid, is optional. In
Atom, entry ids are currently specified as being required. If
you can't generate unique ids for entries, then perhaps Atom is not
the format for you.
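For a consumer, the practical consequence is that an identifier may or may not be there. Here is a small Python sketch of a lookup that checks rdf:about first, then guid, then id. The helper name `entry_id` and the sample fragments are hypothetical, and namespaces on the RSS 2.0 and Atom fragments are left out for brevity.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"

# Invented fragments, one per format.
RSS1_ITEM = (
    '<item xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'rdf:about="http://example.com/1"><title>One</title></item>'
)
RSS2_ITEM = "<item><title>Two</title></item>"  # guid is optional, so omitted
ATOM_ENTRY = "<entry><id>tag:example.com,2004:1</id><title>Three</title></entry>"


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def entry_id(element):
    """Return the identifier of an item or entry element, or None if absent."""
    about = element.get(RDF_ABOUT)
    if about:
        return about
    for child in element:
        if local_name(child.tag) in ("guid", "id"):
            if child.text and child.text.strip():
                return child.text.strip()
    return None


for fragment in (RSS1_ITEM, RSS2_ITEM, ATOM_ENTRY):
    print(entry_id(ET.fromstring(fragment)))
```

When the lookup comes back empty, as it can for RSS 2.0, the aggregator has to fall back on some heuristic of its own to tell items apart.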
Atom has more required elements than RSS. Atom adds type
attributes to titles and links to resolve the ambiguity described
above. It has separate elements for summary and
content. If you want, you can read more here
or here or here.
If you choose to, you can even Get Involved.
So, if you are a tool vendor and would like a little more structure,
rigor, and reproducibility, Atom might be a good choice. But if
you choose to hold back until Atom is done, that's OK too.
However, if you want to do something quick and dirty in RSS 2.0,
go for it. Guilt free. It will get you up and running
quickly.
The key takeaway here is to beware of anybody who preaches one
true format or one size fits all. Each format has its
strengths. And none of them are going away any time soon.
Meanwhile, you can help by spreading the word. The word is
détente.
RSS 1.0 has a reason to exist. RSS 2.0 has a reason to
exist. And Atom has a reason to exist.
And if anybody tells you differently, and won't listen when you
suggest détente, take Brent's suggestion and make use of the
handy Unsubscribe button. That's what it is there for.
[LINK] Sam Ruby: Détente
"And if anybody tells you differently, and won't listen when you suggest détente, take Brent's suggestion and make use of the handy Unsubscribe button."...
François Hodierne : Sam Ruby: Détente - "And if anybody tells you differently, and won't listen when you suggest détente, take Brent's suggestion and make use of the handy Unsubscribe button."...
Sam Ruby says: The key takeaway here is to beware of anybody who preaches one true format or one size fits all. Each format has its strengths. And none of them are going away any time soon. Meanwhile, you can......
Sam Ruby: Meanwhile, you can help by spreading the word. The word is détente. RSS 1.0 has a reason to exist. RSS 2.0 has a reason to exist. And Atom has a reason to exist. And if anybody tells you differently, and...
Your feed conforms to the specification, but contains some GUIDs which are not sufficiently unique. This may cause aggregators like RSSBandit to not show you items in feeds that you are subscribed to. Silently.
Are you sure this is what is happening? It may be that they are reusing IDs in which case RSS Bandit assumes that an entry has had its text updated but does not highlight it as a new entry. Perhaps I should add an option for highlighting changed entries. I have a similar problem with the SourceForge feeds, they always use the same permalinks but their content changes daily. Something else to add to the TODO list.
PS: On a related note to the theme of your blog entry, I recently pinged Scoble and asked him to stop adding his voice to the RSS vs. ATOM conflict. The various flavors of RSS and ATOM are here to stay. No amount of flame wars with people talking past each other and FUDing the other technology is going to change this.
PPS: I got the following error when trying to post to your blog with the CommentAPI. exceptions.UnicodeError - ASCII encoding error: ordinal not in range(128).
Sam,
I wasn't disagreeing that the GUIDs are not unique enough and that this isn't problematic. I was just pointing out that it is more likely that RSS Bandit was downloading the updated feeds just not highlighting posts as new than it was confused by integer GUIDs and not downloading anything at all.
The behavior of GUIDs shared across feeds that I pointed out in my post on atom-syntax is actually a feature that my users keep asking for. I haven't implemented it because I haven't figured out how to do so in a performant manner.
PS: I sent you some mail and I didn't get a response. I couldn't tell if my mail was getting caught by spam filters or you felt the mail didn't need a response. Did you get them?
Sam: "RSS 1.0 has a reason to exist. RSS 2.0 has a reason to exist. And Atom has a reason to exist."
Unconditional agreement. I'm not sure that a general acceptance of those three facts will actually lead to anything approximating detente, but it can't hurt.
I sent you some mail and I didn't get a response. I couldn't tell if my mail was getting caught by spam filters or you felt the mail didn't need a response. Did you get them?
I guess I did not look closely enough to realize that there were two threads. I have now responded to the one that seemed to be conducted in a constructive tone.
Atom Dude: Meanwhile, you can help by spreading the word. The word is détente. RSS 1.0 has a reason to exist. RSS 2.0 has a reason to exist. And Atom has a reason to exist. And if anybody tells you differently, and won't...
Last update: 28/05/04; 09:33:23 EDT Scripting, Blogging, Softwares... amfphp.org: AMFPHP An Open-Source Alternative for Flash Remoting : Flash remoting for PHP enables objects in PHP to become objects in actionscript, almost magically! AMFPHP takes...
Sam Ruby posted a long and interesting writeup on the differences between RSS and Atom, and what tool builders need to worry about. "[I]f you want to support any version of RSS completely, you essentially need to support all of......
The last couple of days I spent on Atom, ATOM! To me it is something new to explore, something fresh. Although I probably haven't discovered everything there is to learn about HTML it is always nice to learn new things (ATOM!)....
Sam that "going for the jugular" thing has paralyzed this community. It's good to see you take a side in this, because it's the one thing that no one should support.
As techies we have a common purpose, to see that the technology "just works" for those who don't care about the ones and zeros (or left and right angle brackets). If this idea can take hold, and maybe go further, you'll probably find that detente is too negative a word. I've never liked the idea that people characterize what goes on here as a war, too grandiose, too self-important. These aren't wars, they're weblog posts, emails, IRC talks.
Yeah, I wish Atom would be more conservative, only change things that need to be changed, but I'll accept instead some (respectful) answers to practical questions that help me and others build software that works with Atom feeds. Anyway, here's hope that no one goes for my jugular. ;->
In a lunch earlier this week Andrew Grumet and I talked about an advocacy howto for the syndication community, based on the Linux advocacy mini-howto. It's an absolute classic. It goes much further than detente, which is settling for too little, imho. Detente wasn't an equilibrium. Eventually the whole system that supported it crumbled. We'll know that happened when this format stuff fades into the background and all people are talking about are features in weblog tools, CMSes, aggregators, feed readers, weblogs; and then we go way beyond that.
Also, btw, there are some errors in your advocacy here that keep cropping up in Atom advocacy. Once and for all, there is absolutely nothing official about the list of namespaces on the RSS site. It's just a list. There are good namespaces that aren't on the list.
Another one that people use, which you (thankfully) haven't -- is that somehow Atom is more extensible than RSS. Can we nip that one in the bud?
Also it's just not respectful to say there are nine versions of RSS. That one caught on, and it's not cool, and it leads to silly he-said-she-said type arguments. Why not just accept, as you basically have here, that there are (unfortunately) two incompatible versions of RSS, it's regrettable, but that's reality.
And while you're at it, since you mention Mark, could he stop saying I stole RSS. I did no such thing. These would all be good things to consider if what you want is an end to hostility.
And of course let me know if there's anything I can do. ;->
As usual, I'm rambling. I'm being called to dinner. I'll be back later to see what other people may have to say.
In case you weren't convinced that Atom was a good idea back in February, here's another opportunity to see the light. (Scoble, I hope you're listening.)...
Dave has a mysterious post up. I am not too much concerned with its content but rather with its form (although said content is actually both amusing, interesting and......
Detente. It's a good word. "Meanwhile, you can help by spreading the word. The word is détente. RSS 1.0 has a reason to exist. RSS 2.0 has a reason to exist. And Atom has a reason to exist." Okay, I can dig that....
Sam Ruby - Détente (my additions in square brackets): RSS1 has a reason to exist [because it's RDF, so can be integrated into datastores]. RSS2 has a reason to exist [because it's easy]. And Atom has a reason to exist [because it has a clear...
Sam Ruby, in his excellent post Détente: The key takeaway here is to beware of anybody who preaches one true format or one size fits all. Each format has its strengths. And none of them are going away any time soon....
A blog entry: I've changed my RSS Feed again (for an explanation of RSS see my introductory post) - it now contains the news items in CDATA tags, so news aggregators will be able to display the item's HTML. I had the motivation from intertwingly.net,...
Sam Ruby: syndication format détente. "There is a Universal Feed Parser. It handles every known version of RSS. It even supports Atom. And CDF. It supports 40 namespaces. It is open source. And even if for some reason you find you can't use...
I can't believe we are arguing about a syndication protocol that's not even supposed to be human readable but we are and it seems like the whole RSS vs ATOM debate is going to continue. Dave Winer just launched a new website called Really Simple...
It may devolve as it has so many times before, but Sam Ruby's Detente post produced an equally constructive response from Dave Winer in kind. Standards-making in the RSS era may not be pretty, but it may get the job done. I read the comments on both...
The quicker the marketing industry can develop standard usage and reporting metrics around RSS (and other similar xml feed standards), the better. The current situation, where online 'media properties' owners (including portals, publishers...
the Syndication Wars -- they're like the Browser Wars, only much more boring and with far less at stake (reference) In other news: OMG, silent data loss, the sky is falling! If I panicked this way every time I realized I had to apply HTML quoting, I...
Maybe I am a glutton for punishment, but I have been trying to follow the whole issue of Silent Data Loss that Sam Ruby mentioned during his offer of "Detente(Detente)":[link] last week. I thought it...
Sam Ruby: Now, view source on the business feed. The descriptions are completely valid and conform to the spec in every way. As you would expect in a business feed from a company like Reuters, many of these descriptions contain stock ticker...
Ian Bicking: Syndication: boring war where only egos are hurt
21:49 01.06.2004 Syndication: boring war where only egos are hurt the Syndication Wars -- they're like the Browser Wars, only much more boring and with far less at stake (reference) In other news: OMG, silent data loss, the sky is falling! If I...
Sam, you've deleted a bunch of my posts here, and marked up others. Now, I'm closing the discussion about this issue on my weblog, I would appreciate if you stopped posting there, and asked Mark to stop making an issue of it. It's high time for you to adopt your own important proposal of "detente" -- which doesn't mean you and Mark get to act like children, while everyone else has to tolerate it.
And of course you'll probably delete this, as you have deleted so many other posts.
BTW, I have a lot of other work to do and I'm getting sick, so I'm not available to have a debate with you about this.
"A software developer that worked this way, on receipt of a bug report from a user, would simply blame the user for the bug, and when that didn't work, say it's not a bug at all. Now certainly some developers do this, but we are harshly critical of them. It's time to apply the same standards to journalists." [link]
When will it be time to apply the same standard to spec maintainers?
Sam,
I don't think it helps your case or the case for Atom to ask for Detente and have your mouthpiece go around bashing everybody. If the Atom community wishes Detente, then surely they should muzzle him too!
Feel free to strike the entire post, iM :) not the usual suspect.
I'm not sure what, if anything, it means, but I did find one reader that doesn't lose the ticker symbols: EffNews appears to look at what's between once-encoded angle brackets, and if it detects HTML then it strips everything (known HTML or ticker symbol or number in an equation) between angle brackets, but if it doesn't find any known tags, it displays them as text.
Steve Gillmor: Gates Paying Attention to RSS. "Bill Gates finally speaks the 'R' word as he highlights the increasingly strategic role of RSS in Microsoft's seamless computing direction." Even Google seems to be using the 'R' word, at least......
So this is first attempt at catching up since resurfacing. We'll see what happens :) "Whatever else history may say about me when I’m gone, I hope it will record that I appealed to your best hopes, not your worst fears; to your confidence...
I'm pondering. Mark has shown there are 9 syntax-incompatible versions of RSS. In discussion this often gets trimmed down to RSS 0.91, 2.0 and 1.0. Plus Atom of course. Version 0.91 was (is?) very popular, but isn't a particularly safe choice...
In reality, I use virtually the same code to handle all variants of RSS 0.9x in BottomFeeder. Heck, the code for 2.0 is simply a few more fields to handle. The only one of the RSS variants that requires special effort is 1.0 (RDF).
As someone who reads weblogs in say an aggregator, you won’t be particularly interested in the format of the feed, as long as your aggregator can support it. Due to the lack of strictness in the RSS spec, you may end up missing information because...
crschmidt’s now playing with In-Feed Feedback, implementing the thing as a WordPress plugin (”doing something useful” would be far more accurate than “playing second fiddle” ;-) Anyhow, I did contemplate pushing the...
NewsGator API Homepage - the highly-regarded NewsGator aggregator now has an Online API (i.e. Web service interface), primarily designed for managing aggregator state across machines. On skimming, the docs look very well put together. You need to...
Elliotte Rusty Harold has posted some slides/notes on syndication for a class he’s been teaching. Good stuff, and I’ve no doubt he covers a lot more than what’s in the slides, but here are one or two points I’d emphasize. Quick one, I’d tweak one...
For shame. Why can’t they “Just” use XML, like everybody else does? It turns out that they do. Just like everybody else does. http I don’t know about you, but in my mind the “first thing about HTTP” is “HTTP...
Tim Bray is speaking on Atom as a case study. RSS is the most successful use of XML in existence. If it’s that successful, why replace it? Tim outlines some problems with RSS as specified: The RSS specification says......
This is adapted from my talk of the same name at ETech 2006. The talk’s sections were entitled Why?, How?, What?, and Lessons?; I’ve left out What?, the description of what Atom is, since we’ve had plenty of that around here. That leaves Why we...
Tim Bray: RSS 2.0 has the biggest mindshare and market share; deservedly, in my opinion. Most people, starting from scratch, would (correctly) pick 2.0 as the RSS version to go with. [cut] There are some issues with RSS 2.0. [cut] The conclusion is...
Sam Ruby: Détente Suppose you are Reuters. You produce a feed with BusinessNews. Not a non-ASCII character in sight. Your feed conforms to the specification, but contains some GUIDs which are not sufficiently unique. This may cause aggregators like...
A few days ago, our CEO Jonathan Schwartz sent a letter to SEC Chairman Christopher Cox calling for SEC financial-disclosure regulations to allow for publishing material financials on the Web. It’s obviously a good idea, but there are some...
Continuing on yesterday’s topic of RSS, I am now starting to understand the importance of RSS standards and appreciating the related politics. There appears to be lots of teeth gnashing and back and forth about these topics. Here is where......