It’s just data


Microdata is a name I don’t care for, but alas, one that will likely stick.  The concept is to provide explicit support in HTML for embedding metadata in content.  Both Microformats and RDFa do related things.

As is common in distributed development, things haven’t exactly happened in chronological order. 

Ian abstracted out use cases, Manu participated, Elias seems happy, and Shelley is continuing to review.

Ian analyzed the use cases and committed the changes to the draft HTML5 spec.  This is an unusual approach for standards organizations, but Commit-Then-Review is a common approach at the Apache Software Foundation.

The very next day, Ian made his first change based on review feedback.  Shelley and Ben expressed similar concerns.  I’m not aware of any changes which have been made since (translation: today!), but the day is still young.  Things clearly are starting to move quickly.

Ben’s concerns clearly go beyond that one attribute, and mostly appear to center on overlap with future plans for an HTML serialization of RDFa (the current Recommendation is for XHTML only).  These plans are rapidly being deployed in what is another chronological anomaly.  As to the question of “who got there first” — if that turns out to be the full extent of the concern — then I agree with Jeni: The web will evolve and there is space enough in it for us all.

Philip has implemented the proposal.  His implementation can produce both RDF triples and JSON output.  Shelley is evaluating.  Her effort is the one that I’m watching most closely.  What I am particularly looking for is any sense as to whether or not there are validated benefits to producers for this syntax as contrasted to the alternatives.  The reason for my focus on producers in this case is that I believe that consumers will, by necessity, be scavengers.  At the moment, I’m assuming that both syntaxes are digestible.  If that turns out not to be the case, then all bets are off.
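To make the JSON side of that concrete, here is a rough sketch of what "scavenging" annotations out of markup might look like.  To be clear, this is not Philip’s implementation, and it is not the algorithm in the draft.  The attribute names (`itemscope`, `itemtype`, `itemprop`) are an assumption taken from later microdata drafts, the `extract_items` helper and the example.org vocabulary are made up for illustration, and nested items are not handled at all.

```python
import json
from html.parser import HTMLParser

# Elements that never get a closing tag, so they are not tracked on the stack.
VOID = {"meta", "link", "img", "br", "hr", "input"}


class ItemExtractor(HTMLParser):
    """Collects flat (non-nested) items; assumes well-formed input."""

    def __init__(self):
        super().__init__()
        self.items = []      # every item found so far
        self.stack = []      # one (starts_item, prop_name) pair per open element
        self.current = None  # the item currently being filled, if any
        self.text = None     # text buffer for the property being read

    def _add(self, prop, value):
        self.current["properties"].setdefault(prop, []).append(value)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        starts_item = "itemscope" in attrs
        if starts_item:
            self.current = {"type": attrs.get("itemtype"), "properties": {}}
            self.items.append(self.current)
        prop = None
        if self.current is not None and not starts_item:
            prop = attrs.get("itemprop")
        # Simplification: prefer an attribute value when one is present,
        # otherwise fall back to the element's text content.
        attr_value = next(
            (attrs[k] for k in ("content", "href", "src") if k in attrs), None
        )
        if prop and attr_value is not None:
            self._add(prop, attr_value)
            prop = None
        if tag in VOID:
            return
        if prop:
            self.text = []
        self.stack.append((starts_item, prop))

    def handle_data(self, data):
        if self.text is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in VOID or not self.stack:
            return
        starts_item, prop = self.stack.pop()
        if prop:
            self._add(prop, "".join(self.text).strip())
            self.text = None
        if starts_item:
            self.current = None


def extract_items(html):
    """Hypothetical helper: returns a list of dicts, one per item."""
    parser = ItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


sample = """
<div itemscope itemtype="http://example.org/Person">
  <span itemprop="name">Alice</span>
  <a itemprop="url" href="http://example.org/alice">home page</a>
</div>
"""

items = extract_items(sample)
print(json.dumps(items, indent=2))
```

A scavenging consumer would presumably want a second code path like this for the other syntax, which is part of why I care whether both are digestible.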

Shelley is now a member of the W3C HTML Working Group (welcome!), and is looking for suggestions.  As for my input: I’d particularly be interested in real live examples (e.g. from Drupal) where the annotations that one would like to express can be expressed more simply, more concisely, or more accurately in one syntax or the other.

Ultimately, it should be possible to reduce these results into a set of criteria as to which format is better for any given use case.  If the results turn out overwhelmingly in one direction or the other, hopefully we will converge on the “better” format.  If the results are mixed, I expect that the HTML5 draft will evolve, and hope that an “RDFa in HTML” draft is produced incorporating this feedback.

Just so it is crystal clear: am I happy that there are multiple syntaxes?  Absolutely not.  But I’m equally unhappy (if not more so) with the continued state of there not being an “RDFa in HTML” draft despite the fact that people are clearly deploying it.  And I note that Elias' initial take seems to be along the lines of “microdata is a simpler syntax and addresses the necessary use cases”.  Hopefully Shelley will soon be in a position to confirm or refute that impression with real data, if she isn’t already.

And should it turn out that Microdata does not enjoy consensus, I will ask that it be removed from the W3C draft.