It’s just data

RDFa in HTML

A number of efforts seem to have sprouted up in the past few days: RDFa for HTML Authors, RDFa Profiles, and RDFa in HTML: a lightweight profile.

Undoubtedly, this will be controversial.  Some will minimize the real problems that this proposal presents.  Others will blow minor details out of proportion.  To both of these, my attitude is the same: this, too, shall pass.  What will remain will be both messy and useful.

Meanwhile, what appeals to me is a greater irony.  This effort is an offshoot of a decision that (with the full benefit of 20-20 hindsight) demonstrated a great deal of hubris in the far distant past.  Some will say that the epitome of this is XHTML2, which attempts to solve many of the same use cases as XHTML1, but starts with a clean slate.  An effort to which the browser vendors expressed a collective meh.  Time, invention, implementation experience, deployment, and feedback from the likes of Creative Commons, Yahoo!, and Google occurred.  And we are now seeing the messy and useful results.

Meanwhile, time, invention, implementation experience, deployment, and feedback from the likes of Apple, Opera, and Mozilla also occurred.  Messy and useful results were produced.  Recently, this produced an offshoot to solve the same use cases as RDFa, but with a clean slate.

I’m reminded of the words of Lawrence Lessig:


The problem with RDFa is that the web is inherently messy. Content producers are not even able to produce well-formed XML, so letting them work with bare-bones semantics will be a disaster. Look at what Google did with RDFa. I’d rather give the content producers some messy microformats, where you can’t do much wrong. It will be harder to extract the correct RDF from that, but the result will at least be proper RDF.

Posted by Sjoerd Visscher at

Sjoerd, what is wrong with Google’s RDFa? The basic RDF data model is very simple: a URI, another URI, and a third URI or primitive datatype. Not much to get wrong there, and if you do get something wrong, it will either produce nothing, or produce something that has no meaning and is thus ignored (but can be salvaged if you wish).

E.g. Google’s case of the URIs missing a / will still produce triples, and those triples are not ‘wrong’, well a little bit, but not really. They exist but are not used, because they do not match any known URIs. Additionally, Google’s stuff has only been released for a short period of time, and you need to give them the opportunity to fix the errors in response to peer review.
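To make the point about the data model and the missing slash concrete, here is a minimal sketch in Python. It is not how any real RDFa processor works: the Triple and Literal types, the vocab.example.org vocabulary, and the recognized helper are all hypothetical names made up for illustration. It only shows that a triple with a malformed predicate URI is still a well-formed triple, one that simply matches nothing a consumer knows about.

```python
from typing import NamedTuple, Union


class Literal(NamedTuple):
    """A primitive value, as opposed to a URI (hypothetical helper type)."""
    value: str


class Triple(NamedTuple):
    """URI, another URI, and a third URI or primitive datatype."""
    subject: str
    predicate: str
    obj: Union[str, Literal]


# Assumption: the consumer only understands this one made-up property.
KNOWN_PREDICATES = {"http://vocab.example.org/name"}


def recognized(triples):
    """Keep only the triples whose predicate the consumer knows."""
    return [t for t in triples if t.predicate in KNOWN_PREDICATES]


triples = [
    Triple("http://example.org/people/alice",
           "http://vocab.example.org/name", Literal("Alice")),
    # Same intent, but the "/" before "name" is missing in the predicate.
    Triple("http://example.org/people/bob",
           "http://vocab.example.orgname", Literal("Bob")),
]

used = recognized(triples)
ignored = [t for t in triples if t not in used]
```

Both statements are parsed into triples, but only the first ends up in used. The second lands in ignored, where it could still be salvaged by a consumer willing to guess at the intended URI.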

Note that the Google documentation and examples are actually riddled with errors, including errors in the microformats (e.g. ‘fn’ versus ‘name’) and in the HTML (“<span<strong”). Yet all the FUD seems to be focusing only on the errors in the RDFa.

If content producers can mess up RDFa statements, then they can just as well mess up microformat statements, and thus any RDF extracted from them. Semantic errors (e.g. a misassigned property) can be expressed just as easily in microformats as they can be in RDF.

Keep in mind that RDF does not say that you have to trust everything that is on the web. You can restrict the domains you pull data from to trusted domains that are known to produce quality data, and/or analyse/filter the data you receive from untrusted domains to improve its reliability. This is how RDF deals with incorrect data.
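As a rough illustration of the first option, restricting data to trusted domains, a consumer could keep track of where each statement came from and drop the rest. This is only a sketch under stated assumptions: the host names, the harvested list, and the from_trusted_source helper are hypothetical, and a real consumer would likely want something more nuanced than a fixed allow-list.

```python
from urllib.parse import urlparse

# Assumption: hosts this consumer has decided to trust (made-up names).
TRUSTED_HOSTS = {"data.example.org"}


def from_trusted_source(source_url, trusted=TRUSTED_HOSTS):
    """Return True if the page a statement was found on is on a trusted host."""
    host = urlparse(source_url).hostname
    return host is not None and host in trusted


# Each entry pairs the page a triple was extracted from with the triple itself.
harvested = [
    ("http://data.example.org/people",
     ("http://example.org/people/alice", "http://vocab.example.org/name", "Alice")),
    ("http://spam.example.net/page",
     ("http://example.org/people/alice", "http://vocab.example.org/name", "Buy pills")),
]

accepted = [triple for source, triple in harvested if from_trusted_source(source)]
```

Only the statement harvested from data.example.org ends up in accepted. The other statement is not deleted from the web, it is merely not believed by this particular consumer.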

Also, non-RDF data may not be good enough. For example, in a tagging-style microformat, two tags with identical names but distinct meanings likely cannot be told apart, whereas in RDFa they can (provided the origin site has this knowledge). A pet peeve of mine with regard to tags: if only YouTube knew the difference between the ‘MSX’ 80s home computer and the ‘MSX’ Counter-Strike clan, that would be swell.
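The tag problem can be sketched the same way. In the Python below, the item names and tags.example.org URIs are invented for illustration, and group_by is a hypothetical helper. Grouping by the visible label lumps everything together, while grouping by a URI that identifies the meaning keeps the two senses apart.

```python
from collections import defaultdict

# Hypothetical tagged items: (item, visible tag label, URI identifying the meaning).
tagged = [
    ("video-1", "MSX", "http://tags.example.org/msx-home-computer"),
    ("video-2", "MSX", "http://tags.example.org/msx-clan"),
    ("video-3", "MSX", "http://tags.example.org/msx-home-computer"),
]


def group_by(items, key_index):
    """Group item names by the field at key_index (1 = label, 2 = URI)."""
    groups = defaultdict(list)
    for row in items:
        groups[row[key_index]].append(row[0])
    return dict(groups)


by_label = group_by(tagged, 1)  # one bucket: every video is just "MSX"
by_uri = group_by(tagged, 2)    # two buckets: computer versus clan
```

This only helps, of course, if the origin site knows which meaning applies and publishes the distinct URIs in the first place.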

Posted by Laurens Holst at

And just to illustrate that microformats are not infallible either, this nice link: [link]

Posted by Laurens Holst at

The RDFa+HTML lightweight profile was produced with the help of a result of 1998: XHTML. For example, it links in the XHTML vocabulary, which was defined by the XHTML Modularization spec and which in turn builds on the profile concept in HTML 4.

HTML 4 is less extensible than XHTML is. Yet, HTML 5 removes the extensibility mechanisms (DTD and @profile) that have allowed the RDFa+HTML lightweight profile to be created.

So, yeah, much irony here. But I’m not sure we agree on what the irony is, or on which body is most worthy of the hubris stamp.

Posted by Leif Halvard Silli at

Sam Ruby: RDFa in HTML

Sam Ruby: RDFa in HTML. Thu 14 May 2009 at 19:48. A number of efforts seem to have sprouted up in the past few days: RDFa for HTML Authors, RDFa Profiles, RDFa in HTML: a lightweight profile. Undoubtedly, this will be controvers... vantguarde RDFa, HTML...

Excerpt from vantguarde / HTML / RDFa (20) at
