Meanwhile, a few days ago
Sean Palmer initiated an
ExtensibilityFramework that just so happens to address a number
of Tim's issues. In the process, a
RelaxNG grammar was produced which covers not only the core
elements, but also all extensions.
The purpose of this extensibility framework is to allow a
uniform mechanism for handling text (with and without
markup)
and detecting URIs. The latter would also help with things
like xml:base
processing.
Instead of reinventing XLink, why not just use XLink? I don't see much point in using the 'ref' convention when the xlink:href convention already exists and is a standard. Plus, the ability to automatically pull out all links in an Atom document would probably be very useful. In vocabularies as 'linky' as Atom, I'd think a domain-independent linking scheme would be essential.
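To make the "pull out all links" point concrete, here is a minimal Python sketch of a generic extractor that only knows about xlink:href and nothing about the vocabulary it is reading. The sample document and its element names (feed, generator, related) are made up for illustration and are not taken from any Atom draft.

```python
import xml.etree.ElementTree as ET

# The XLink namespace is fixed by the XLink spec; ElementTree exposes
# namespaced attributes as "{namespace}localname".
XLINK_NS = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK_NS}}}href"


def extract_xlinks(xml_text):
    """Return every xlink:href value in document order."""
    root = ET.fromstring(xml_text)
    return [el.attrib[HREF] for el in root.iter() if HREF in el.attrib]


# Hypothetical document: the element names are illustrative only.
SAMPLE = """<feed xmlns:xlink="http://www.w3.org/1999/xlink">
  <generator xlink:href="http://example.org/genwell"/>
  <entry>
    <related xlink:href="http://example.org/blog/4321.html"/>
  </entry>
</feed>"""

links = extract_xlinks(SAMPLE)
for link in links:
    print(link)
```

The extractor needs no schema for the host vocabulary, which is the domain-independence being argued for here.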
'ref' is used in a context where possibly any of 'href', 'id', or 'uri' could have been used.
'uri' and 'id' are probably closer than href/ref, as it's being used more like "this identifies this resource" and "this is the id of the resource I'm talking about."
'ref' is more like XIOR's xior:xoid and xml:id, although XIOR and xml:id make a potentially unnecessary distinction between an id and a reference to that id. In both of those cases, though, they make it explicit that an id can be used only once, on the element that holds all the information about that id, whereas many references can be made to the same id.
'ref', on the other hand, can appear more than once, and the information from all the elements sharing a 'ref' forms a union. Because it can appear more than once, and because a distinction between resource id and reference is not a necessity, a single attribute was picked.
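The "union" behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not EF's actual processing model: the record shape (a ref plus a dict of properties) and the property names are assumptions, and how conflicting values for one property should be handled is not addressed (here the later value simply wins).

```python
from collections import defaultdict


def merge_records(records):
    """Merge (ref, properties) pairs so each ref maps to the union
    of the properties seen on every element carrying that ref."""
    merged = defaultdict(dict)
    for ref, props in records:
        merged[ref].update(props)  # later values overwrite earlier ones
    return dict(merged)


# Hypothetical records: two separate elements mention the same ref.
records = [
    ("http://example.org/genwell", {"name": "GenWell"}),
    ("http://example.org/feed", {"generator": "http://example.org/genwell"}),
    ("http://example.org/genwell", {"version": "1.0"}),  # invented property
]

merged = merge_records(records)
print(merged["http://example.org/genwell"])
```

Note that the same URI appears both as a key (the subject being described) and as a property value (a reference to that subject), without needing two different attributes.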
The discussion of 'ref' versus 'uri' is in the IRC log. It appears that the code got changed to 'ref' at one point and no one brought it back up.
Bo, one reason for not using XLink is to not introduce too many namespaces in the Atom core. Though I see the usage for XLinks, I think we should keep the core as clean and simple as possible, and rather implement an Atomified version of XLink (within the Atom namespace) than implementing XLink directly.
Having more than one namespace in the core isn't very user-friendly or beautiful. Also, having different linking methods in the core (atom:link) and in the extensions (xlink) is inappropriate.
Sean, please bring that distinction back. If you want to keep ExtensibilityFramework simple (and I guess you do), make the URI @id rather than [id]. But don't use @ref for both.
Ziv, the distinction between URLs and URIs is not getting "lost", it was discovered that they are not so distinct after all. The IETF and W3C have been working a long time on clearing up what it means to be a URI and/or a retrievable resource, and for the most part I think they're getting it right.
The biggest difference between a URI and what used to be called a URL is based on context. In Atom, it doesn't matter how the URI is represented (element content, a 'ref' attribute, or an 'id' attribute), it matters where it's used. <link> is meant to be retrieved. <id> is not. In the Atom 0.2 snapshot, there's nothing else that indicates that the element content is a URI or URL. 'ref' is the same way.
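As a rough Python sketch of "it matters where it's used": the processor below decides what to do with a URI purely from the element that contains it, using a table that stands in for knowledge taken from the format's documentation or schema. The table, the un-namespaced sample entry, and the role names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Knowledge that comes from the vocabulary's documentation, not from
# the URI itself. The role names are hypothetical.
URI_ROLES = {"link": "retrieve", "id": "identify"}


def classify_uris(xml_text):
    """Return (element, uri, role) for each element known to carry a URI."""
    root = ET.fromstring(xml_text)
    found = []
    for el in root.iter():
        role = URI_ROLES.get(el.tag)
        if role is not None and el.text:
            found.append((el.tag, el.text.strip(), role))
    return found


# The same URI value serves both purposes; only the context differs.
SAMPLE = """<entry>
  <link>http://example.org/blog/4321.html</link>
  <id>http://example.org/blog/4321.html</id>
</entry>"""

classified = classify_uris(SAMPLE)
for tag, uri, role in classified:
    print(tag, uri, role)
```

Nothing in the URI string distinguishes the two uses; the lookup table carries the whole distinction.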
Ken, in light of this, please explain how a consumer of an ExtensibilityFramework resource can determine which refs are retrievable and which are not.
Ziv: the first part of a URI is a scheme. This is the portion of the URI that precedes the first colon. Schemes like 'http' are retrievable. Schemes like 'mailto' are not.
Which schemes your application will support retrieval on is up to you. 'ftp' might be a good idea. 'irc' may or may not. 'urn' definitely not.
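A minimal Python sketch of this scheme-based heuristic follows. The allow-list is an application choice, as stated above; the inclusion of 'https' and the function names are assumptions for the example. It only answers "would this application attempt retrieval", not whether retrieval would succeed or was intended.

```python
from urllib.parse import urlsplit

# Application-chosen allow-list. 'http' and 'ftp' come from the
# discussion; 'https' is added here as an assumption.
RETRIEVABLE_SCHEMES = {"http", "https", "ftp"}


def scheme_of(uri):
    """Return the part of the URI before the first colon, lowercased."""
    return urlsplit(uri).scheme.lower()


def may_retrieve(uri):
    """True if this application is willing to attempt retrieval."""
    return scheme_of(uri) in RETRIEVABLE_SCHEMES


for uri in (
    "http://example.org/blog/4321.html",
    "mailto:someone@example.org",
    "urn:isbn:0451450523",
):
    print(uri, may_retrieve(uri))
```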
Schemes like 'http' are retrievable. Schemes like 'mailto' are not.
This is untrue. URIs are just identifiers. Notions like whether something is "retrievable" based on its URI scheme are quaint notions from the days of URLs.
Tim Berners-Lee: The Web works because, given an HTTP URI, one can in a large number of cases, get a representation of the document.
Note also that the definition of the href attribute of the A element in HTML 4.01 is in terms of URIs.
Dare, clearly this is an area of intense theoretical debate. I merely would assert that Ziv and others are pretty safe to assume that an HTTP URI is likely to be retrievable.
I would like to amend my previous statement: interpreting a mailto URI by launching your preferred mail client with the To: field pre-filled (and perhaps the subject) is certainly reasonable.
Sam, I can't tell if you are stating or suggesting that URI schemes that can be used to retrieve a resource are the sole indicator of whether or not a URI used in some context should be retrievable. (I'm in the camp that says the URI scheme is not the indicator.)
are logically equivalent. In both cases, it is the documentation (and/or schema) that says that <link> should be retrievable and <id> only used as an identifier. Going further, the value of the URIs could be the same (http://example.org/blog/4321.html), still serve both purposes (identifier and retrievable resource), and still be dependent on the context to indicate how it should be used.
In EF and RDF, the contexts are also clearly specified. It still is the vocabulary that tells you whether a given URI is intended to be retrievable, but EF and RDF also use URIs as both the subject identifier of resource records and the value of properties of resource records (references to other resource records). RDF uses rdf:about when talking about the subject URI and rdf:resource when talking about the object value's URI, but they could easily be one attribute because it is always clear by the syntax when one is identifying the subject or using a URI as an object value.
EF defines a more compact XML model than RDF/XML (which is one of the reasons people are looking at it). In doing so, it doesn't have the luxury of using two different attributes depending on whether the URI is being used as an identifier or a reference. Consider:
Here we have a feed ('.../feed') that has a generator ('.../genwell') and a generator ('.../genwell') that has a name ("GenWell"). The 'genwell' URI is used as both the property value URI in the feed and the subject URI of the generator resource record. Here is roughly equivalent RDF:
"[...]somewhere along the way the distinction between URLs and URIs got completely lost, which I think is a mistake." -- Ziv Caspi # "the distinction between URLs and URIs is not getting "lost", it was discovered that they are...
Sam, using the scheme to determine whether you have a URI or a URL not only breaks static typing (and, I assume, XSD), it also doesn't work. For example, people who post twice a day might want both posts to have the same URL, but they certainly shouldn't both have the same ID.
Ken, if I understand you correctly, you're saying that to learn whether the attribute in X/@ref is a URI or a URL I need to have some knowledge of X itself which is not part of the document itself (I'm careful not to say infoset, see? :-). This could be some external schema, or have the meaning of all Atom elements "burned" into the processing application itself.
As far as I can tell from the EF proto-spec, however, this doesn't agree with the purpose of EF; namely, to be able to make such distinctions in an extensible manner, without external "help".
Ziv, correct: you must have knowledge of X, which is no different than in the Atom 0.2 snapshot (no relation to EF or RDF), nor in RDF, nor in several other specs.
Are you suggesting there should be a flag or attribute in formats indicating whether a URI is retrievable or not? If I read URI != URL correctly, you are suggesting that, so I'll need to follow up there to say why I don't see that as either necessary or applicable here.
Re. EF, I don't see anything in the purpose of EF that needs to know whether a URI is retrievable or not. xml:base, for example, can be used with both identifiers and retrievable resources without knowing which is which.
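The xml:base point can be illustrated with a short Python sketch: relative references are resolved against the nearest xml:base in exactly the same way whether the result will later be treated as an identifier or fetched. The sample document, its element names, and the use of a 'ref' attribute on every element are assumptions for the example, not EF syntax taken from the proto-spec.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def resolve_refs(xml_text):
    """Resolve every 'ref' attribute against the in-scope xml:base."""
    resolved = []

    def walk(el, base):
        # An xml:base on this element updates the base for its subtree.
        base = urljoin(base, el.attrib.get(XML_BASE, ""))
        if "ref" in el.attrib:
            resolved.append(urljoin(base, el.attrib["ref"]))
        for child in el:
            walk(child, base)

    walk(ET.fromstring(xml_text), "")
    return resolved


# Hypothetical document: no element says whether its ref is retrievable.
SAMPLE = """<feed xml:base="http://example.org/blog/" ref="feed">
  <generator ref="genwell"/>
  <entry ref="4321.html"/>
</feed>"""

absolute = resolve_refs(SAMPLE)
for uri in absolute:
    print(uri)
```

Resolution is purely syntactic, so the processor never needs to ask which of these URIs is meant to be dereferenced.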
Ken, I'm suggesting that we have clear differentiation between URIs (such as IDs) and URLs (such as links to various resources). This distinction is important for processors.
Sam, the fact that URL is a subtype of URI means that if you mark all your URLs as URIs, you'll lose functionality. Conceptually, it means that a processor cannot tell if something is retrievable or not (and if it tries to retrieve the resource and fails, whether it is a temporary problem or not).
I'm aware that HTML (4.01, I believe) says that A/@href is a URI. In my opinion this is in error. Indeed, browsers ignore this and mark with a hyperlink everything in the A element, even if the system would not be able to oblige when the user actually clicks the link.
The Web works, but we shouldn't take that to mean it's perfect.