It’s just data

One More Step Forward?

Tim Bray: I’m going to have to go back and patch up the code so it doesn’t emit any of those nasty colons and relative URI references that apparently hurt implementors’ fragile feelings.

As Tim continues to update his post with more and more aggregators that already do support these features, I’m gaining hope that some day I can retire the following Feed Validator message: Avoid Namespace Prefix.  Initial testing indicates that Firefox and IE (and presumably, therefore, Thunderbird and all tools based on the Windows RSS Platform) are both in the “good” column here.  Unfortunately, it looks like Opera isn’t.  Furthermore, at least one of the tools in the “bad” column already has a fix in beta, and at least one other tool doesn’t have the namespace prefix problem.  Those that wish to do so are welcome to reference the tests mentioned on the XML Namespace Conformance Tests wiki page, and are invited to update that page with up-to-date results.

With the potential retirement of that one warning message comes the need for another: Relative href value on self link.  Note that this message only occurs if the self link is relative and there is no absolute xml:base attribute value on either the link element or the enclosing feed element.  Tim’s ongoing feed, for example, continues to validate with no warnings.
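The condition behind that warning can be sketched roughly as follows. This is only an illustration of the rule as described above, not the Feed Validator's actual code; the function names are invented for the example, and it assumes an Atom 1.0 feed whose root element is atom:feed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

ATOM = "{http://www.w3.org/2005/Atom}"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def is_absolute(ref):
    """True when the reference carries its own scheme (e.g. http:)."""
    return bool(ref) and bool(urlsplit(ref).scheme)


def relative_self_link_warning(feed_xml):
    """Hypothetical check: warn when a feed-level self link is relative and
    neither the link nor the enclosing feed has an absolute xml:base."""
    feed = ET.fromstring(feed_xml)
    for link in feed.findall(ATOM + "link"):
        if link.get("rel") != "self":
            continue
        if is_absolute(link.get("href", "")):
            continue  # absolute href, nothing to warn about
        if is_absolute(link.get(XML_BASE)) or is_absolute(feed.get(XML_BASE)):
            continue  # an absolute xml:base makes the href resolvable
        return True
    return False
```

A feed with an absolute xml:base on the feed element and an empty href on the self link passes this check; the same feed without xml:base triggers the warning.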


So, why shouldn’t <link rel="self" href=""/> be OK?  RFC3986 makes it perfectly clear how to turn that into an absolute URI reference, whether or not there’s an xml:base in effect.  Well, suppose someone saves it into a file?  Then you won’t know how to use it.  Yeah, but that’s true for every URI reference in a given Atom feed. Either relative URI references are OK, or not.  I don’t see why the rel="self" case has special standing.

Posted by Tim at

I’m with Tim.  Just because there are bad implementations out there that fail to even try to do the right thing, that doesn’t mean it should be avoided.  Relative self links, even same doc links, are perfectly legal, acceptable and easily understood.  Consumers of feeds with relative links need to know how to deal with 'em.  Consumers that refuse to deal with them are buggy and crappy and need to be fixed.

Posted by James Snell at

This has got nothing to do with bad implementations. AFAICT the self link was specifically designed with auto subscription in mind (see Joe Gregorio’s post on the subject) and for that purpose it had to be absolute (or derivable). No other links in the feed have that need, since once you know the self link you know the URI from which the feed was obtained, and thus its base URI.

You could make the argument that auto subscription is no longer an issue now that most browsers have built in support for feeds, but in that case you might as well throw out the self link altogether - what other purpose does it really serve?

I don’t want to argue with people that insist on including useless self links in their feeds though - it is legal according to RFC 4287. Just don’t kid yourself that it’s serving some purpose by being there.

Posted by James Holderness at

In aggregated feeds, the self link is also used within the source element to specify the location of the feed the entry originally came from.  However, there’s nothing that says it has to be absolute to be useful; it just needs implementations that properly implement the specs.  It’s really not all that difficult a thing to do, even if it’s not a big priority.

Posted by James Snell at

Unfortunately, it looks like Opera isn’t.

Think you could file a bug, preferably with a clear test case referenced in the report? I know Opera’s BTS sometimes seems like a black hole, but all bug reports are dealt with and appreciated.  Those with proper testcases doubly so.

Posted by Arve at

In the absence of xml:base, <link rel="self" href=""/> is a tautological statement, and as such adds zero information.  As it is information free, it can not serve any purpose, much less the purpose that it was originally designed for.

Posted by Sam Ruby at

Think you could file a bug, preferably with a clear test case referenced in the report?

I don’t have an Opera login, so I can’t actually see the bug report I just entered, but I gather that the following URI will allow those that can to do so:

https://bugs.opera.com/show_bug.cgi?id=285206

Posted by Sam Ruby at

In the absence of xml:base, <link rel="self" href=""/> is...

I just don’t buy it.  Per 3986, it asserts “Wherever you got this from is the preferred place to go and get it again”.  There is no ambiguity.  Of course, if you lost the “where you got this from” context, it becomes useless.  But then so does every other URI reference in my feed.  If you’re going to take Atom elements and send them somewhere or save them or repurpose them, you have to fix up all the relative references before losing context, or you’ve just broken everything, not just link@rel="self".  And if you do that, my construct works fine.
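A small illustration of the resolution step, using Python's standard `urllib.parse.urljoin`. The feed URL is a made-up placeholder, not the address of any real feed.

```python
from urllib.parse import urljoin

# Hypothetical retrieval URI, chosen only for illustration.
retrieved_from = "http://example.org/blog/feed.atom"

# An empty href is a same-document reference: it resolves to the base itself.
same_document = urljoin(retrieved_from, "")

# Other relative references resolve against the same base.
sibling = urljoin(retrieved_from, "comments.atom")

# With the retrieval context lost, nothing absolute can be recovered.
without_context = urljoin("", "")
```

Here `same_document` equals the retrieval URI, while `without_context` stays empty, which is the lost-context case being argued about.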

Posted by Tim at

The Sunday Contests

Who needs the NFL? The 0.00001% of the population who think Syndication Semantics and Web Architecture are all about fun ought to cruise by Sam Ruby’s space and take in the discussion around One More Step Forward?...

Excerpt from ongoing at

In the absence of xml:base, <link rel="self" href=""/> is...

I just don’t buy it.

The phrase you elided was “a tautological statement”.

Do you disagree that the statement that empty XML element makes in the absence of xml:base attribute values is a tautology?  Conversely, do you assert that such an element serves any purpose, let alone the original one?

Note: no one is saying that such an element is illegal.  Nor is anyone saying that consumers that do not understand such an element should be fixed.

Posted by Sam Ruby at

This discussion reinforces my opinion that there never should have been a self link. Instead the Atom spec should have required (or at least suggested) an xml:base attribute containing the absolute URI of the feed. Which is also a good idea for every other XML document on the web.

Posted by Sjoerd Visscher at

Re “tautological” - that word means a statement that is necessarily true, e.g. “A or not A”, and thus carries no information.

<link rel="self" href=""/>

is in fact information-bearing.  The “” value for href has a very precise meaning; I could have chosen lots of other values that would have had other meanings, but I didn’t. 

What’s a scenario in which the href="" causes a problem?

Posted by Tim at

The “” value for href has a very precise meaning

Per RFC 3986, it is a same document reference.

What’s a scenario in which the href="" causes a problem?

Problem is too strong of a word.

The issue is that the very use case that this specific link relation, and only this one, was designed to address is only possible if this link’s href can be resolved to an absolute URI without having to refer to any data outside of the feed itself.

Posted by Sam Ruby at

I don’t have an Opera login, so I can’t actually see the bug report I just entered, but I gather that the following URI will allow those that can to do so:

Thanks.

Posted by Arve at

Autumnal mod_atom

I just did a massive check-in on mod_atom , and it’s now not just an Atom Store, it’s also a basic blog publisher. This fragment is about how it works, and includes a confession; I did one fairly awful thing along the way....

Excerpt from ongoing at

Tim:

Of course, if you lost the “where you got this from” context, it becomes useless.

That’s the problem right here: rel="self" is intended for situations where context is lost by design. It was meant to allow browsers to cooperate with aggregators on subscriptions in the same way they cooperate with MP3 players on playlists: when the user clicks the link, the browser sees it’s a media type handled by another program and treats the representation as a binary blob that it never even peers into, it just saves it to disk verbatim and launches the other program with that file as an argument.

The rel="self" link is a protocol for overcoming the inherent loss of context of that interaction. So if you put something in there that cannot be resolved to an absolute URI in absence of context, then there is no point to having a rel="self" link in your feed at all.
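A rough sketch of the receiving side of that hand-off, assuming the aggregator is launched with nothing but the path of the saved file. The function name is hypothetical, and the xml:base handling is simplified to the feed and link elements only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlsplit

ATOM = "{http://www.w3.org/2005/Atom}"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def subscription_uri(saved_path):
    """Return an absolute feed URI recovered from the saved file alone,
    or None when the self link cannot be resolved without outside context."""
    feed = ET.parse(saved_path).getroot()
    base = feed.get(XML_BASE, "")  # no retrieval URI is available here
    for link in feed.findall(ATOM + "link"):
        if link.get("rel") == "self":
            # xml:base on the link is itself resolved against the feed's base.
            link_base = urljoin(base, link.get(XML_BASE, ""))
            candidate = urljoin(link_base, link.get("href", ""))
            return candidate if urlsplit(candidate).scheme else None
    return None
```

With an absolute href or an absolute xml:base the function returns a usable URI; with `href=""` and no xml:base it returns None, since the file on disk says nothing about where it came from.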

Posted by Aristotle Pagaltzis at

rel="self" is intended for situations where context is lost by design.


That is what xml:base is for too.

Posted by Sjoerd Visscher at

Posts in this thread seem to me to say more than is supported by the specifications.

Tim: Per 3986, it asserts “Wherever you got this from is the preferred place to go and get it again”.

RFC 3986 doesn’t contain the string “prefe”.  I can’t see why a document on the syntax of URIs would have anything to say on the subject.  RFC 4287 says The value “self” signifies that the IRI in the value of the href attribute identifies a resource equivalent to the containing element.  Just an equivalent resource, not the preferred equivalent resource.

So <atom:link rel="self" href=""/> seems to me to say that the document is equivalent to itself, and nothing more.  Yes, that seems somewhat tautological to me.  (Of course, if it appeared in an element other than the top level element it would be self-contradictory.)

Aristotle: rel="self" is intended for situations where context is lost by design.  Again, where is this supported by the specifications?  Maybe you have some insight into the intentions of the specification authors but if those intentions are not reflected in the specifications then they have been lost.  If this was the intention then surely RFC 4287 would have required an absolute IRI or xml:base.

Posted by Ed Davies at

If memory serves, I believe Aristotle has identified the original motivation for rel="self", effectively a workaround for agents like browsers that hand over the document to other processing (like PDF viewers), without retaining a reference to the source URI.

While most of the time xml:base may be interchangeable, I think there is utility in having a separate Atom-specific element. The spec says this “is the preferred URI”, which may not be the same as the current retrieval URI or xml:base (think mirrors or Feedburner). Possible alternatives would be to include the back references in the HTTP headers or use RDF tools to provide richer descriptions, but these seem less straightforward now rel="self" is in an RFC.

If you have payload in another XML dialect (say XHTML), that data may have its own xml:base(s), i.e. its own original context(s). If the payload is to be forwarded, then it’ll be a lot easier to distinguish the intermediary source from the original context, no need to keep a stack or expand all the URIs.

Posted by Danny at

Oops - yes, RFC 4287 4.1.1 does say about such a link in the atom:feed element: This is the preferred URI for retrieving Atom Feed Documents representing this Atom feed.

Posted by Ed Davies at

Atom:link’s are allowed to contain relative references.  We can debate intent all we want but there are no exceptions made for the “self” link.  It may well be the case that there are not any good reasons for a feed publisher to use relative references in the self link, but feed consumers still need to be prepared to deal with 'em if they are used.  So long as the feed is valid, it’s the feed consumer’s responsibility to preserve the context for “self” links appropriately, just as it’s supposed to do for all other types of links.

Posted by James Snell at

James Snell: who said that feeds that contain relative links aren’t valid?

Posted by Sam Ruby at

who said that feeds that contain relative links aren’t valid?

I haven’t seen anybody say they aren’t valid but Aristotle wrote: The rel="self" link is a protocol for overcoming the inherent loss of context of that interaction. So if you put something in there that cannot be resolved to an absolute URI in absence of context, then there is no point to having a rel="self" link in your feed at all.

He supposes a purpose of self links which would require an absolute IRI in the href or an absolute URI in an associated xml:base, wouldn’t it?  I don’t know if his supposition is supported by the RFC, though.

Posted by Ed Davies at

Sam, I didn’t claim that anyone said that.  What I said is that consumers have to be prepared to deal with relative refs in self links regardless of whether anyone thinks self links with relative refs are useful.

Posted by James Snell at

James, that was my next question.  As I said previously, nobody is disagreeing with either statement.

The relation was designed for a purpose.  There may be other purposes to which it can be put.  If a given link element is used in a way that defeats any known purpose to which this element can be put, it is not an error, and consumers have to be prepared to deal with this as best they can.

Agreed?

Posted by Sam Ruby at

Agreed.

Posted by James Snell at

Ed:

I didn’t say relative URIs are invalid; by a literal reading of the spec you are fine, even if what you are doing is completely useless. Just consider what this makes you.

It has been said that given enough implementations, all SHOULDs in a spec will eventually come to be treated as MUSTs or MAYs. Given today’s software landscape I believe the SHOULD dictate for a rel="self" falls in the latter category; unless a convincing argument can be given that I am doing anyone at all a disfavour, I will be removing the rel="self" links from all feeds I generate and will not be adding them to future feeds.

Posted by Aristotle Pagaltzis at

I will be removing the rel="self" links from all feeds I generate and will not be adding them to future feeds.

Aristotle: amusingly, the same blog post you point to has a word for that behavior too.

What would an angel do?  Write an errata I suppose.  But I suspect that Mark is prophetic on this matter too.

As it stands now, a feed can contain a relative IRI reference which resolves differently based on subtle variations on how the URI which fetched the feed was constructed (variations in case, percent encoding, trailing slashes, query arguments, fragment identifier, and, my personal favorite, presence or absence of a www. prefix).  And each variation would have a legitimate claim to being the “preferred” one.
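To make those variations concrete, here is a small sketch using Python's `urllib.parse.urljoin`. All of the URIs are invented placeholders that are assumed to reach the same feed.

```python
from urllib.parse import urljoin

# Hypothetical URIs that a server might all answer with the same feed.
retrieval_uris = [
    "http://example.org/feed/",
    "http://www.example.org/feed/",
    "http://example.org/feed",
    "http://example.org/Feed/",
    "http://example.org/feed/?format=atom",
]

# href="" resolves to whichever URI happened to be used for the fetch,
# so every variation yields a different "preferred" URI.
resolved = {urljoin(uri, "") for uri in retrieval_uris}

# A trailing slash also changes how a non-empty relative reference resolves.
no_slash = urljoin("http://example.org/blog", "feed.atom")
with_slash = urljoin("http://example.org/blog/", "feed.atom")
```

In this sketch `resolved` ends up with five distinct entries, and `no_slash` and `with_slash` point at different documents.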

Like James Holderness suggested, I would support an errata that removed the SHOULD for the presence of rel="self" as an immediate child of atom:feed elements.  I would also support adding a requirement that indicated that link elements that specify rel="self" and are immediate children of atom:feed elements SHOULD be resolvable to an absolute IRI based solely on the information inside the feed.

And preferably both.

Posted by Sam Ruby at

Amusingly, the same blog post you point to has a word for that behavior too.

Touché. :-)

And preferably both.

+1

Actually, for me that SHOULD is creating an issue that has always bugged me: as my content is statically generated and gets rsync’d to plasmasturm.org from a staging setup on my own machine, I can’t vary the URI depending on whether I’m looking at my staging site or the public website and so end up pointing at the feed on the public site even when the feed is only on the staging site. Thus my moronic(-seeming?) decision.

Posted by Aristotle Pagaltzis at

[As a completely off-topic parenthetical nitpick, “errata” is a plural noun, and its singular is “erratum”.]

Posted by Aristotle Pagaltzis at

For those of you that think this is all theoretical BS, let me give a real world example of how this will affect users.

Tim’s website has a little green feed icon at the top right of the page that links to his feed. If I’m viewing his website in an older browser (e.g. Internet Explorer 6.0) I can click on that icon and have it auto-subscribe in my registered feed reader. This is not theoretical - I’ve just tested it now and it works. However, if his feed were using his new mod_atom software (which doesn’t include an absolute self link), the auto-subscription would fail. Not because the feed reader is buggy and crappy as James Snell seems to suggest, but because there isn’t anything else they can do.

Bottom line: not including an absolute self link just makes it more difficult for users to subscribe to your feed. But if you don’t care about people using older browsers, or don’t care about your users at all, then I guess this isn’t a problem.

Posted by James Holderness at

Actually, for me that SHOULD is creating an issue that has always bugged me

the SHOULD itself is creating an issue?  Try reading section 3 of RFC 2119.

Thus my moronic(-seeming?) decision.

There is a subtle but important distinction between there being valid reasons in particular circumstances to ignore a particular item and therefore choosing a different course after understanding and weighing the full implications; and arguing that the spec is ambiguous, or misleading in some way, or ignorable because nobody else implements it, or simply wrong.  Mark identifies with another label the specific type of moron that does the latter.

Posted by Sam Ruby at

...even if what you are doing is completely useless.

A relative self link to another feed is not useless: it says that the other feed is the preferred one.  Therefore, as Tim originally asserted, a relative self link to this feed (i.e., one with an empty href attribute) is useful to the very limited extent that it says that, yes, this is the preferred version of this feed.  Also it’s a rather asshole way of complying with the RFC as it is worded.

It’s a bit difficult to understand why RFC 4287 says that the atom:feed element SHOULD contain this link.  If it was intended that implementations would rely on it, why not say MUST?  If it was to be used in the cases where the context was lost, then why not say it has to be resolvable without the context?

Yes, maybe the spec should be fixed - though I’m not sure it’s critical.  The consequence of not doing so is that some feeds may have one element which is not terribly useful and feed reader software will need to remember, somehow, where it got the file from (aren’t there other formats which require this, anyway?). 

Until then any feed which complies is good.  Any software which relies on feeds being more tightly constrained is, to some extent, broken.  Wasn’t having a feed format where writers could depend on the spec, rather than having to reverse engineer the behaviour of many readers, the point of creating Atom in the first place?

Posted by Ed Davies at

A relative self link to another feed is not useless

If the intent is to have multiple URIs returning the same feed, and the desire is to express which of these several feeds is the preferred one, a relative IRI reference is generally not the way to do it.  A concrete example: www.tbray.org and tbray.org both provide access to the same resource, but one is clearly preferred.

Wasn’t having a feed format where writers could depend on the spec, rather than having to reverse engineer the behaviour of many readers, the point of creating Atom in the first place?

Modulo one potential new, minor, errata^hum, I believe Atom does a pretty good job.

Posted by Sam Ruby at

Modulo one potential new, minor, errata^hum, I believe Atom does a pretty good job.

I wasn’t suggesting otherwise.

Posted by Ed Davies at

It’s like I don’t even need to be here.

Posted by Mark at

However, if his feed were using his new mod_atom software (which doesn’t include an absolute self link), the auto-subscription would fail. Not because the feed reader is buggy and crappy as James Snell seems to suggest, but because there isn’t anything else they can do.

No, the reader can’t do anything with it because the appropriate context is not preserved.  It’s not the relative URI that’s causing the problem, it’s the inability to process it correctly... which is the client’s fault, not Tim’s.  That said, if Tim wanted to make things easier, he really should be using xml:base.

Posted by James Snell at

which is the client’s fault, not Tim’s.

Sorry, no.  It takes two to interoperate.  SHOULD simply means SHOULD, and those that deviate from this are making a conscious decision away from interoperability with some less-featured products in favor of some other criteria.  For example, Aristotle apparently feels that being able to relocate his server easily is more important to him than interoperating with IE 6.0.  Similarly, if Tim wants to insert a meticulously spec-compliant, but utterly useless element into his feed, he is welcome to do so.

Where I differ is when blame is attempted to be applied to the victims for not being able to implement a feature that would have been enabled if the element had a fully resolvable href attribute given only the information available in the feed.

Consider the IE 6.0 subscription use case that James Holderness mentioned above.  In this scenario, the document is forwarded onto an aggregator.  While it is true that such a feed reader would not be able to interpret all of the relative URIs in such a feed correctly, the recipient of this feed isn’t attempting to do this.  It merely is trying to extract the URI of the feed from the information it has available.  Once the subscription is made, subsequent processing of the feed needs to be done with the full knowledge of how to resolve relative URIs.

For them to do otherwise would clearly be a bug.

Posted by Sam Ruby at

James:

If I’m viewing his website in an older browser (e.g. Internet Explorer 6.0) I can click on that icon and have it auto-subscribe in my registered feed reader. This is not theoretical

Well, in my case, it is theoretical, because I’m serving application/xhtml+xml, so no IE 6 users can view the site in the first place. (I’m not keeping them away out of any principled stance, but simply because I run plasmasturm.org for my enjoyment and there are enough unrewarding-to-overcome issues to support IE 6 that my motivation has never been sufficient to tackle them.)

Sam:

There is a subtle but important distinction between there being valid reasons […] and arguing that the spec is [bad or ignorable]

Wait, you lost me – are you saying I fall in the latter category or exempting me from it?

Posted by Aristotle Pagaltzis at

Where I differ is when blame is attempted to be applied to the victims for not being able to implement a feature that would have been enabled if the element had a fully resolvable href attribute given only the information available in the feed.

Perhaps you missed the part where I said “if Tim wanted to make things easier, he really should be using xml:base.”  The blame is squarely on the inability to properly implement the specification.  If I produce a feed that is spec compliant and attempt to read that feed with software that only implements part of the spec, I should not be surprised to discover that certain things might not work as expected.  Yes, one solution is to change the feed so that things will work, and if the feed producer is willing to do so, then great.  If they’re not (or can’t) then the client just has to deal with it.

Posted by James Snell at

Aristotle:

Well, in my case, it is theoretical, because I’m serving application/xhtml+xml, so no IE 6 users can view the site in the first place.

Firefox 1.5 users can still view your site, but they also won’t be able to auto-subscribe via the feed link at the top of your page.

James:

software that only implements part of the spec

WTF!?! Which part of the spec am I not implementing? I cannot auto-subscribe to Tim’s mod_atom feed so obviously I’m doing something wrong. Please tell me what that is. Ideally point me to the magical spec that tells me how to suck Tim’s full feed URI out of thin air.

I get that Tim’s feed is completely valid. I get that he may not care that I can’t auto-subscribe. But, please, FFS, stop telling me that it’s my fault.

Posted by James Holderness at

RFC 3986, Section 5.1:

The term “relative” implies that a “base URI” exists against which the relative reference is applied... relative references are only usable when a base URI is known.  A base URI must be established by the parser prior to parsing URI references that might be relative.

Section 5.1.1:

Within certain media types, a base URI for relative references can be embedded within the content itself so that it can be readily obtained by a parser.

Section 5.1.2:

If no base URI is embedded, the base URI is defined by the representation’s retrieval context.  For a document that is enclosed within another entity, such as a message or archive, the retrieval context is that entity.  Thus, the default base URI of a representation is the base URI of the entity in which the representation is encapsulated.

Section 5.1.3:

If no base URI is embedded and the representation is not encapsulated within some other entity, then, if a URI was used to retrieve the representation, that URI shall be considered the base URI.  Note that if the retrieval was the result of a redirected request, the last URI used (i.e., the URI that resulted in the actual retrieval of the representation) is the base URI.

Section 5.1.4:

If none of the conditions described above apply, then the base URI is defined by the context of the application.  As this definition is necessarily application-dependent, failing to define a base URI by using one of the other methods may result in the same content being interpreted differently by different types of applications... A sender of a representation containing relative references is responsible for ensuring that a base URI for those references can be established.  Aside from fragment-only references, relative references can only be used reliably in situations where the base URI is well defined.

As far as I can tell, Tim’s feed meets all of these requirements in that the base URI for his relative references can be established by using the URI used to retrieve the feed.  However, because IE6 fails to preserve that information, your software cannot work properly.  The blame falls on IE6, which does not maintain the context when it sends the representation to your reader.  If your features are dependent on IE6, then your features will be limited to what it is, or is not, able to do. As I’ve said, Tim could choose to be helpful and include xml:base in his feed, or change to using an absolute URI, but he is under no obligation to do so.
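The order of precedence in the sections quoted above can be sketched like this. The function and parameter names are invented for the illustration, and the sketch assumes any embedded base has already been made absolute.

```python
from urllib.parse import urljoin, urlsplit


def establish_base_uri(embedded=None, encapsulating=None,
                       retrieval=None, application_default=None):
    """Pick the base URI in the order of RFC 3986 sections 5.1.1 to 5.1.4."""
    for candidate in (embedded, encapsulating, retrieval, application_default):
        if candidate:
            return candidate
    return None


def resolve_self_link(href, **context):
    """Resolve a self link href, or return None when no base can be found."""
    if urlsplit(href).scheme:
        return href  # already absolute, no base needed
    base = establish_base_uri(**context)
    if base is None:
        return None
    return urljoin(base, href)
```

Calling `resolve_self_link("", retrieval="http://example.org/feed.atom")` gives back the retrieval URI, while `resolve_self_link("")` with no context at all returns None, which corresponds to a reader that was handed only the bytes of the feed.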

Posted by James Snell at

However, because IE6 fails to preserve that information, your software cannot work properly.

IE 6 predated Atom; Atom link rel="self" was designed explicitly for this situation; Tim’s feed conforms to the Atom specification, yet somehow fails to interoperate; and somehow that’s IE’s fault?

I think not.

Posted by Sam Ruby at

It’s a well known fact that IE6 does not preserve the appropriate context. Relying on IE6 to enable a feature that can be dependent on the preservation of the base URI is going to be problematic at best.  If Tim is willing to help you work around that limitation, then great, if not, oh well.

Posted by James Snell at

I feel like I am talking to a politician.  Question: are you going to raise taxes?  Answer: family values are the bedrock of America...

Each time these questions are asked, the subject is changed.  IMHO, the spec has a bug in it.  It recommends more than it should, and fails to recommend that which actually would be useful in this situation.  This isn’t an “OMG, the whole spec is busted” kind of bug, but a minor defect that affects few and somehow escaped notice in the time it took us to create the specification.

Whether it is worth the effort to fire up the IETF machinery to address this is a separate question; but denying that the bug exists and blaming the innocent are two things that I don’t want to be a party to.

Posted by Sam Ruby at

Could the spec be changed to make things better? Probably.  Is it possible that such a change would happen any time soon?  Unlikely.  In the meantime, the problem still exists and the solution remains the same: either convince Tim to change his feed and/or come to terms with the fact that certain clients — in particular those that are based on software that pre-dates Atom and has known limitations — will be incapable of realizing the full benefits of the spec.  If and when the time comes to update the Atom spec, I’ll be among those voting +1 to reform the requirements for the self link.

Posted by James Snell at

This is probably a troll and Sam should feel free to delete it BUT there are folks on this thread that are exhibiting borderline sociopathic behavior.

Specs do not exist in a vacuum. The fact that you can obey the letter of the law while still contravening its purpose and the entire spirit of said law is not something to be proud of.


either convince Tim to change his feed and/or come to terms with the fact that certain clients — in particular those that are based on software that pre-dates Atom and has known limitations — will be incapable of realizing the full benefits of the spec.

Exactly what is this so-called benefit of the spec? The self link is completely useless if one accepts your line of reasoning since applications are supposed to determine the data it contains from out-of-band sources.

Posted by Dare Obasanjo at

The point I am making is simple: Feed producers are not beholden to the limitations of feed consumers.  Feed producers can choose to help work around those limitations, but are under no obligation to do so.  The limitations of the client are the client’s responsibility.

Another scenario: Atom depends on namespaces.  Some XML parsers do not support namespaces.  Whose fault is it if a client application uses an XML parser that does not support namespaces and they can’t read the feed?

Or another: Atom allows for the use of IRIs.  Not all clients support IRIs.  Whose fault is it if a client tries to treat link href’s as URIs and finds that it cannot process non-ascii characters?

Posted by James Snell at

Dare Obasanjo:

The self link is completely useless if one accepts your line of reasoning since applications are supposed to determine the data it contains from out-of-band sources.

That’s a bit extreme.  In all the self links that I have encountered, Tim’s is the first which can’t be resolved using only the information contained inside the feed itself.

What it takes to make the self link useful is a small bit of additional information, and the recommendation to do so unfortunately did not make it into the spec.  However, the spec is revisable, if we should care to do so.

James Snell:

The point I am making is simple

That point has been made again, and again, and conceded, again, and again, and again.  Need I go on?

Oh, wait.  I asked a question.  I know, the answer will be to reframe the question and to answer the completely different one.  Meanwhile, the original question raised will go unanswered.

Posted by Sam Ruby at

IE 6 predated Atom; Atom link rel="self" was designed explicitly for this situation; Tim’s feed conforms to the Atom specification, yet somehow fails to interoperate; and somehow that’s IE’s fault?

Isn’t IE 6’s plug-in mechanism broken for any format which uses relative URIs, not just Atom?

Posted by Ed Davies at

Heh, this is getting pointless but to answer the question...

Does the spec adequately capture the requirements necessary to support that design?

Yes.

Does conforming to the spec provide you with the feature that it was designed for?

Assuming folks fully implement the mechanisms necessary to support the feature, then yes.

Posted by James Snell at

Progress is relative

Suggested change to [link] Append: While using relative references with the self link is perfectly valid; there are clients that are currently incapable of processing such links properly because of...

Excerpt from snellspace.com at

Isn’t IE 6’s plug-in mechanism broken for any format which uses relative URIs, not just Atom?

RSS 2.0, for example, doesn’t officially sanction the use of relative URIs anywhere, but such usages are widespread.  The problem compounds when such feeds are used with a service like FeedBurner which relocates the feeds.  The FeedBurner solution is to add an atom link with a rel="self".  Some tools have picked up on this and use this information to resolve relative references in feeds.  Fortunately, FeedBurner inserts an absolute reference.
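A rough sketch of that heuristic in Python, for illustration only. Neither RSS 2.0 nor Atom defines the self link as a base URI, so this shows what such tools appear to do rather than anything a spec requires; `resolve_item_links` is a made-up name and the feed layout (an rss root with a single channel) is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urljoin

ATOM_LINK = "{http://www.w3.org/2005/Atom}link"


def resolve_item_links(rss_xml: str) -> List[str]:
    """Resolve each item's link against the channel's atom self link."""
    channel = ET.fromstring(rss_xml).find("channel")
    if channel is None:
        return []
    base = ""
    for link in channel.findall(ATOM_LINK):
        if link.get("rel") == "self":
            base = link.get("href", "")
            break
    # urljoin leaves absolute links untouched and, when no self link was
    # found (empty base), returns the relative link unchanged.
    return [
        urljoin(base, item.findtext("link", "").strip())
        for item in channel.findall("item")
    ]
```

The result is only as good as the self link: if that href were itself relative, there would be nothing absolute to resolve against.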

Assuming folks fully implement the mechanisms necessary to support the feature, then yes.

You are probably too young to remember Lily Tomlin’s bit where she portrayed a phone operator who would ask the question “This is the operator. Am I speaking to the party to whom I am connected?”

In any case, what you are saying is that every tool that has access to all the information necessary to resolve a same document reference (i.e. the set of tools that do not need the self link in the first place) can use the information they already have to confirm that they are looking at the document that they have already received.

And in the process, you’ve skipped the first question.  It is abundantly clear that the original use case which motivated the creation of this feature in the first place was to address the needs of tools which do not have the information necessary to support the feature as used in the manner that Tim’s feed does.

Posted by Sam Ruby at

It is abundantly clear that the original use case which motivated the creation of this feature in the first place was to address the needs of tools which do not have the information necessary to support the feature as used in the manner that Tim’s feed does.

Which brings us back to the two points I made at the beginning of this post

The spec adequately addresses the problem but is subject to the law of unintended consequences — namely that, in certain situations, it’s possible to apply the solution without actually solving the problem.  I don’t consider that to be a spec bug although the spec can certainly be changed to make things better.

Posted by James Snell at

The spec adequately addresses the problem but is subject to the law of unintended consequences

Translation to English:

“The spec adequately addresses the problem, except in the cases where it doesn’t.”

Posted by Aristotle Pagaltzis at

rel="self" does not belong in errata.

The WG put it in on purpose, with the exact definition that appears in the document. It is an atrocious design error, and that was pointed out at the time. I guess those people should have written more follow-up emails, or something. You would need to obsolete RFC 4287 to make it go away.

Posted by Robert Sayre at

In case anyone is interested, PaceFeedLink was the original Pace which introduced the concept of a self link. If you bother to read it, you can see the rationale, as well as the original recommendation that the href be absolute.

Posted by James Holderness at

Digging into the archives:

The rel="self" thing was introduced in draft-ietf-atompub-format-06 with this text:

atom:feed elements SHOULD contain one atom:link element with a rel
attribute value of "self" and SHOULD contain a href attribute with
an absolute URI as its value.  This URI identifies the feed and a
representation equivalent to the feed.

The very next edition (draft-ietf-atompub-format-07) it now appears as this:

atom:feed elements SHOULD contain one atom:link element with a rel
attribute value of "self".  This URI identifies the feed and a
representation equivalent to the feed.

That expression remains for all the subsequent editions.

The change notes for -07 have only this:

Change atom:source-feed to atom:source.
Add ABNF reference
Many editorial tweaks
Rework extensibility
Adjust page breaks in txt version

I would consider the change of allowing relative URIs there more than just an editorial tweak, and I don’t see any Pace listed where consensus was sought. I found 272 messages on the Atom Format WG list between -06 (March 14 2005) and -07 (March 31), but the only mention of rel="self" was in a thread questioning why rel="alternate" MUST be present and pointing out that rel="self" was a SHOULD for presence (furthermore, that discussion occurred after the March 31 date on -07 and its arrival on the list on April 4).

I can’t find any discussion of relaxing the “SHOULD contain a href attribute with an absolute URI” requirement. If there were any egregious errors, then this more-than-editorial removal of a SHOULD requirement was it. That the WG failed to notice this error is unfortunate too, of course.

Posted by Eric Scheid at

Well, if the original agreed-upon language magically dropped out of the spec, then I would definitely agree that there’s a spec problem.

“The spec adequately addresses the problem, except in the cases where it doesn’t.”

It’s more a question of cases in which developers choose to build on technologies where the solution can’t be adequately applied.  In any case, my opinion of whether it’s a bug or not is actually quite irrelevant.  The more important issue, as Rob points out, is that the spec is done. We either write up a new RFC to fix it or learn to deal with it.  It’s not important enough to warrant a new RFC. Anyone want to start working on a BCP to document these kinds of issues?

Posted by James Snell at

Uh, actually the problem is with rel="self". It doesn’t work, whether the URI is absolute or not, because it introduces interesting new opportunities to subscribe to the wrong thing.

Regarding the absolute requirement, I don’t have my mail from back then. But reading the PaceFeedLink thread, I see that some people counted as in favor didn’t want the absolute requirement but were otherwise ok with it, so perhaps there was a message pointing that out that should have been public. But anyway, other messages over the course of 2005 revisited the issue, so the WG had plenty of time to change that text.

Posted by Robert Sayre at

“interesting new opportunities to subscribe to the wrong thing”

The link[@rel='self'] is content as published by the owner/publisher of that feed. The worst that might happen is that you might slam all your subscribers onto someone else, which is not all that different in effect from you deciding to henceforth publish all manner of drivel ... subscribers that don’t like the new content will unsubscribe. Having said that ... there is the interesting possibility of the owner of a heavily subscribed feed using that situation for a one-chance-only denial of service attack. You’d be better off effecting a DDOS via leeching graphics referenced in your feed though.

Any other interesting possibilities?

Posted by Eric Scheid at

Sam:

The FeedBurner solution is to add an atom link with a rel="self".  Some tools have picked up on this and use this information to resolve relative references in feeds.

I forgot to ask this earlier, but did you have a particular tool in mind? I’ve seen a lot of different things used as the base URI in RSS feeds, but I’ve never seen an aggregator use an atom self link. If there are aggregators doing that, I’d love to know about it - one more technique to add to my collection.

Eric:

The worst that might happen is that you might slam all your subscribers onto someone else

If you wanted to do that, surely you could just use an HTTP redirect? Far more likely to work too. I doubt there are many aggregators (if any) that interpret a self link as a 301 redirect. At best you could force new subscribers onto someone else, but that’s not much of a slam in comparison.

Posted by James Holderness at

I forgot to ask this earlier, but did you have a particular tool in mind?

It’s a bug, and I have talked to one of the developers and he indicated that they will fix it — at least for Atom — but one of the bigger web based aggregators that Tim listed as failing to consume his feed properly does this now.

Posted by Sam Ruby at
