intertwingly

It’s just data

Sifting for Metadata


Elias Torres: Atom undoubtedly will be the format and API of choice for all these content types, but its design was to be the minimal amount of metadata to communicate information and not a rich semantic framework to express it all.

Along the way, Elias notes the irony that the output of a SPARQL query is not RDF/XML.

I agree that Atom isn’t intended to be a rich semantic framework, in much the same way that HTTP was never intended to be a highly advanced distributed object system.  I’ll also note in passing that RDF is multi-faceted.

Enough theory; let’s take something a bit more concrete.  Look at this feed.  If this feed were expressed in RDF/XML, how would you express the following as a SPARQL query?

//atom:entry[.//xhtml:a[@href='http://feedvalidator.org/']]

In English, this amounts to finding all entries which contain a link to the Feed Validator.
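
For anyone who wants to try this query without an XPath engine at hand, here is a minimal sketch using only Python’s standard library.  ElementTree supports only a subset of XPath, so the "contains a link to" test is written as a loop over each entry’s descendants.  The sample feed, the function names, and the use of the Atom 1.0 namespace are all assumptions made for illustration; they are not taken from the feed mentioned above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs (assumption: the feed uses the Atom 1.0 namespace
# and carries its content as inline XHTML).
ATOM = "http://www.w3.org/2005/Atom"
XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical two-entry feed, invented purely for this example.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:example.org,2005:1</id>
    <title>Links to the validator</title>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>Check with the <a href="http://feedvalidator.org/">Feed Validator</a>.</p>
      </div>
    </content>
  </entry>
  <entry>
    <id>tag:example.org,2005:2</id>
    <title>Links elsewhere</title>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>See <a href="http://example.org/">something else</a>.</p>
      </div>
    </content>
  </entry>
</feed>"""


def entries_linking_to(feed_xml, href):
    """Return the atom:entry elements that contain an xhtml:a with this href."""
    root = ET.fromstring(feed_xml)
    matches = []
    for entry in root.iter(f"{{{ATOM}}}entry"):
        # Search only inside this entry, not across the whole document.
        links = entry.iter(f"{{{XHTML}}}a")
        if any(a.get("href") == href for a in links):
            matches.append(entry)
    return matches


def titles(entries):
    """Convenience helper (hypothetical): pull the title text from each entry."""
    return [e.findtext(f"{{{ATOM}}}title") for e in entries]


if __name__ == "__main__":
    found = entries_linking_to(SAMPLE_FEED, "http://feedvalidator.org/")
    print(titles(found))
```

Note that the search is scoped to each entry in turn.  A link that appears in one entry should not cause every other entry in the feed to match.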

My theory is that most of the interesting metadata is in the content.  That’s the essence of the impetus behind microformats.  It’s what makes Google work.  It’s what makes Technorati, Feedster, Bloglines and any number of other search engines work.