It’s just data

Queso

Elias Torres: Queso is a J2EE-style application that implements the Atom Protocol specification, currently at draft-09, atop an RDF server called Boca (the restaurant’s name is Boca Grande, a.k.a. Big Mouth), using Henry Story’s Atom OWL for the model and of course opening up a SPARQL endpoint to query the contents of the store.  All of that, of course, is just the beginning; we will be creating more compelling demos that bridge the Semantic Web and Web 2.0, some of which will include RDFa techniques.

Can it do this?

It looks like what Elias and friends are working on is a tool that takes feeds and slices and dices them into sub-Atomic particles in a form that is very amenable to ad-hoc queries.  For the moment, they are focusing on the semantically rich and easy-to-parse subset of the feeds out there: Atom 1.0 with entries that have content with type="xhtml".  Presumably in the future they will expand to support things like TagSoup.

Internally, this data is stored as RDF, which turns out to be a very reasonable choice.  I’m not a fan of the RDF/XML serialization format, but when you have data for which new relationships are being invented seemingly daily, RDF triples make a lot of sense, particularly with data as entwined as this data tends to be.  If the feeds being processed are valid, pretty much every piece of data in the database can be traced back to who said it, when it was said, where it was said, and to some human-readable summary or content.

Seeing that this effort is beginning to mine data inside the content, my first question (above) deals with searching for data in its native format, namely HTML links.

My next question is somewhat harder, and deals with updates.  Suppose I were to correct a link in a post of mine.  Would the new link get added?  Would the old link get removed?


It appears from this link that updates are treated as totally separate entities.  Or perhaps I’ve misunderstood what this means:

“Because Atom relies on both an ID and a timestamp to establish the uniqueness of an entry, the ID cannot be used as the resource that serves as the subject of statements describing the particular Atom entry. Instead we use a blank node that has <http://www.w3.org/2005/10/23/Atom#id> and <http://www.w3.org/2005/10/23/Atom#updated> properties. All of the triples for a particular Atom id are stored in a named graph that has this id as its name.”
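
A minimal sketch of the shape that quoted passage describes, written in Python rather than against Boca's own API. The entry is a fresh blank node carrying the id and updated properties, and the Atom id names the graph. The helper names, the example id, and the tuple layout are illustrative assumptions, not Queso's code.

```python
import itertools

ATOM = "http://www.w3.org/2005/10/23/Atom#"
_bnode_ids = itertools.count()


def new_blank_node():
    """Return a fresh blank-node label such as '_:b0' (hypothetical helper)."""
    return "_:b%d" % next(_bnode_ids)


def entry_to_quads(atom_id, updated):
    """Describe one version of an entry as (graph, subject, predicate, object).

    The subject is a blank node, not the Atom id, because the id alone
    does not identify a particular version of the entry. The id is used
    as the name of the graph that holds the triples.
    """
    entry = new_blank_node()
    return [
        (atom_id, entry, ATOM + "id", atom_id),
        (atom_id, entry, ATOM + "updated", updated),
    ]


# Two versions of the same (made-up) entry: same graph, different subjects.
first = entry_to_quads("tag:example.org,2006:post-1", "2006-08-01T12:00:00Z")
second = entry_to_quads("tag:example.org,2006:post-1", "2006-08-02T09:30:00Z")
```

Whether both versions stay in the graph or the second replaces the first is a separate decision that this sketch does not make.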

(By the way, I had the hardest time figuring out how to get this post to through, until I added the code marker around the id stuff.  Kept get XML not well formed errors, which I think are errors in your response after it tried to treat the URLs as links...)

Posted by Stephen Duncan Jr at

Apparently I lost the ability to speak English when finishing that post. “...get this post to GO through...” "...Kept getTING XML not..."

Posted by Stephen Duncan Jr at

... now suppose that the update was a mere typo fix, i.e., one that the author does not wish to flag as a “significant” change, so the value of atom:updated isn’t changed.

Even “significant” updates aren’t as easy as they seem.  Tim Bray has a post that he updates most weeks.  The content doesn’t tend to change much most weeks, nor do the links.  However, the images at those links do change with every update.

Posted by Sam Ruby at

Stephen,

No matter how many updates you make to the entry, the URL for self/alternate will always be the same and the atom:id does not change. Our RDF server supports revisioning natively, meaning that any addition to or deletion from the RDF graph produces a new version. However, that is completely independent of our Atom server implementation. James Snell works with us on Queso and we are actively making sure that our semantics reflect those intended by the specifications. However, many are still being worked out.

Posted by Elias at

Sam,

I’ll soon be posting a SPARQL query that returns the same search results as your bloglines example.

Posted by Elias at

Sam,

“My next question is somewhat harder, and deals with updates.  Suppose I were to correct a link in a post of mine.  Would the new link get added?  Would the old link get removed?”

The old link is removed and the new one is added. Basically on every request I convert the Atom XML to the equivalent RDF and replace the entire contents of my named graph for the entry.
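
As a rough illustration of why replacing the whole named graph answers both halves of the question, here is a toy in-memory store in Python. The class, its method names, and the example URLs are made up for this sketch; they are not Boca's or Queso's API.

```python
ATOM = "http://www.w3.org/2005/10/23/Atom#"


class NamedGraphStore:
    """Toy in-memory store: one set of triples per graph name."""

    def __init__(self):
        self.graphs = {}

    def replace_graph(self, name, triples):
        # Overwrite rather than merge, so triples from the previous
        # version of the entry (such as an old link) do not linger.
        self.graphs[name] = set(triples)

    def objects(self, name, predicate):
        """Return every object stored under the given predicate in a graph."""
        return {o for (_, p, o) in self.graphs.get(name, set()) if p == predicate}


store = NamedGraphStore()
entry_id = "tag:example.org,2006:post-1"  # hypothetical Atom id

# First request: the entry links to the old URL.
store.replace_graph(entry_id, [("_:link", ATOM + "href", "http://example.org/old")])
# Second request: the author corrected the link.
store.replace_graph(entry_id, [("_:link", ATOM + "href", "http://example.org/new")])

hrefs = store.objects(entry_id, ATOM + "href")
```

After the second request only the corrected link remains in the graph. A store that merged triples instead of replacing them would report both.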

Posted by Elias at

SPARQL query:

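PREFIX a: <http://www.w3.org/2005/10/23/Atom#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?href ?title ?body
WHERE {
  GRAPH ?g {
    # Each entry has a link, a content node, and a title node.
    ?entry a:link ?link ; a:content ?content ; a:title ?t .
    # Keep only the rel="alternate" link and take its href.
    ?link a:rel "alternate"^^xsd:string ; a:href ?href .
    # The text of both the content and the title is held in a:body.
    ?content a:body ?body .
    ?t a:body ?title .
  }
}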

Visit Queso Atom Browser, click on the ‘Graphs’ tab and uncheck any boxes there. Paste the query in the box in the ‘Query’ tab and click on ‘Get Results’. Of course, you may filter on a specific link, as in your blog post, by adding a FILTER statement after the last triple pattern inside the GRAPH block:

PREFIX a: <http://www.w3.org/2005/10/23/Atom#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?href ?title ?body
WHERE {
  GRAPH ?g {
    ?entry a:link ?link ; a:content ?content ; a:title ?t .
    ?link a:rel "alternate"^^xsd:string ; a:href ?href .
    ?content a:body ?body .
    ?t a:body ?title .
    # Restrict the results to entries whose alternate link is this URL.
    FILTER ( str(?href) = "[insert URL]" )
  }
}
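
If you would rather generate the filtered query than edit it by hand, something like the following Python would do. It is only a sketch: build_query and sparql_string are made-up helper names, and it produces the query text without sending it to any endpoint.

```python
QUERY_TEMPLATE = """PREFIX a: <http://www.w3.org/2005/10/23/Atom#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?href ?title ?body
WHERE {
  GRAPH ?g {
    ?entry a:link ?link ; a:content ?content ; a:title ?t .
    ?link a:rel "alternate"^^xsd:string ; a:href ?href .
    ?content a:body ?body .
    ?t a:body ?title .%s
  }
}"""


def sparql_string(value):
    """Quote a value as a SPARQL string literal.

    Only backslashes and double quotes are escaped, which is enough for
    ordinary URLs but not for arbitrary text.
    """
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_query(url=None):
    """Return the query above, with the FILTER added when a URL is given."""
    filter_clause = ""
    if url is not None:
        filter_clause = "\n    FILTER ( str(?href) = %s )" % sparql_string(url)
    return QUERY_TEMPLATE % filter_clause


# Hypothetical URL, for illustration only.
filtered = build_query("http://example.org/post")
unfiltered = build_query()
```

Paste the resulting text into the ‘Query’ tab exactly as you would the hand-written version.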


Posted by Elias at

[from dagoneye] Sam Ruby: Queso

[link]...

Excerpt from del.icio.us/network/manuel at

[from manuel] Sam Ruby: Queso

The old link is removed and the new one is added. Basically on every request I convert the Atom XML to the equivalent RDF and replace the entire contents of my named graph for the entry....

Excerpt from del.icio.us/network/gavin at

The Atoms! They SPARQL!

It’s cool, and it’s buzzword-compliant. It’s cuzzword-compliant! Err, no, wait. It’s cool. Let’s leave it at that....

Excerpt from International Man of Transparency at

Why I don’t like to infer semantics from tags

del.icio.us is a popular referring source to my weblog. I noticed quickly that one of the tags being used was ‘queso’ for obvious reasons. Silly me to think that since most of the del.icio.us audience is technical in body and mind, the...

Excerpt from Elias Torres at

Practical SemWeb Outreach

Speaking of Elias, it would be remiss of me not to mention he’s done it again. The SparqlCalendarDemo he and Lee Feigenbaum put together was a great demonstration of Ajax+SemWeb. Now he, with Wing and Ben (great write-ups!) are covering Atom...

Excerpt from Planet RDF at
