Quantifying the "RDF tax"
Some have asserted that there is an "RDF tax". Not having an opinion on the subject, I decided to do a little investigating. In particular, I sought to identify the highest potential "ceiling" on the RDF tax.
So, with the help of a number of people on IRC (in particular, Mark Pilgrim, Ken MacLeod, Shelley Powers, and Sean Palmer), I developed an XSLT transform from the Atom 0.2 snapshot into the most comprehensive RDF equivalent. Some urged me to simplify along the way, but the methodology I wanted to apply here was first to seek to understand, and only then to simplify.
The initial results are that the maximal Atom 0.2 snapshot balloons from 47 non-blank, non-comment lines to a whopping 61 lines. And that is before simplification. Clearly, some readability was lost in the process. This, too, needs to be addressed.
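A minimal sketch of how such a count might be taken, assuming "non-comment" means that XML comments are stripped before counting and that a line containing only whitespace counts as blank. The helper names `count_significant_lines` and `relative_growth` and the tiny sample feed are illustrative assumptions, not part of the transform itself.

```python
import re


def count_significant_lines(xml_text: str) -> int:
    """Count lines that still have content once XML comments are removed."""
    # Strip comments first, including ones that span several lines.
    # Assumption: a comment sharing a line with markup does not hide that markup.
    without_comments = re.sub(r"<!--.*?-->", "", xml_text, flags=re.DOTALL)
    return sum(1 for line in without_comments.splitlines() if line.strip())


def relative_growth(before: int, after: int) -> float:
    """Growth of `after` over `before`, as a fraction of `before`."""
    return (after - before) / before


# Hypothetical sample, only to exercise the counter.
sample = """<feed>
  <!-- a comment -->

  <title>Example</title>
</feed>
"""

print(count_significant_lines(sample))    # 3
print(f"{relative_growth(47, 61):.0%}")   # 30%
```

By this measure the ceiling on the tax, before any simplification, is 14 extra lines on a base of 47.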
A number of observations:
- leaving the XSLT as an "exercise for the student" ends up sweeping a lot under the rug. This was hard, and I'm not sure I'm close to being done.
- this effort provided an alternate view of this data, which surfaced a number of questions I had never pondered before. For example: is the order of contributors significant? This needs to be answered and documented.
- a number of possible synergies also arose. For example, independent of RDF, might it make sense to re-use FOAF terminology?
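The question about contributor order comes up because an RDF graph is a set of triples: repeating a property says nothing about order, while an rdf:Seq container records it explicitly through rdf:_1, rdf:_2, and so on. A rough sketch of the difference, using plain Python tuples rather than an RDF library. The `EX` namespace, the `contributor` and `contributors` property names, the entry URI, and the names Alice and Bob are all hypothetical, chosen only for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/atom#"        # hypothetical namespace
ENTRY = "http://example.org/entry/1"   # hypothetical entry URI


def unordered(contributors):
    """One triple per contributor; the graph is a set, so order is not recorded."""
    return {(ENTRY, EX + "contributor", name) for name in contributors}


def ordered(contributors):
    """An rdf:Seq whose membership properties rdf:_1, rdf:_2, ... record position."""
    seq = "_:contributors"  # blank node label, illustrative
    triples = {
        (ENTRY, EX + "contributors", seq),
        (seq, RDF + "type", RDF + "Seq"),
    }
    for position, name in enumerate(contributors, start=1):
        triples.add((seq, f"{RDF}_{position}", name))
    return triples


a = ["Alice", "Bob"]
b = ["Bob", "Alice"]

print(unordered(a) == unordered(b))  # True: the two orderings are the same graph
print(ordered(a) == ordered(b))      # False: the ordering is part of the data
```

Whichever answer is chosen determines which of these two shapes the transform should emit.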
Once we have a sufficient understanding of the "proper" way to apply RDF, then we can move on to exploring the "practical" way to apply RDF.