intertwingly

It’s just data

Web3S


M. David Peterson: Web3S is Microsoft’s answer to a RESTful web publishing protocol. In many ways it attempts to tackle the same problems solved by the Atom Publishing Protocol.

I took a look at “Web Structured, Schema’d & Searchable”, and found Structure, but was unable to find the Web, Schema, or Search.

But let me first back up.

The web with which the Atom Publishing Protocol concerns itself consists of resources that may be binary (things like GIFs and JPEGs), markup (things like HTML and XHTML), or arbitrary XML; these resources may also contain outbound links to resources that reside either on the same host or on different hosts.

The data that Web3S concerns itself with consists of element information items (EIIs) which, in Web3S at least, must always form acyclic single rooted trees. EIIs have a name, an ID, a parent, and zero to many children. EIIs are explained in terms of the XML Infoset, though apparently serializing the Web3S Infoset into/out of JSON is an open issue (3SAFQ), as, for that matter, are XPATH (3SAFO) and SEARCH (3SAFP).
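To make that shape concrete, here is a minimal sketch in Python of an EII as described above: a name, an ID, a parent, and zero to many children, with a guard that keeps the tree acyclic and single rooted. The class name EII and the methods add_child and root are my own inventions for illustration, not anything taken from the Web3S document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class EII:
    """Hypothetical in-memory element information item (not from the spec)."""
    name: str                          # reverse-domain name
    id: Optional[str] = None           # ID, if the element has one
    parent: Optional["EII"] = field(default=None, repr=False)
    children: List["EII"] = field(default_factory=list)

    def add_child(self, child: "EII") -> "EII":
        # A node may have only one parent, which keeps the tree single rooted.
        if child.parent is not None:
            raise ValueError("child already has a parent")
        # Walk up from self; if we meet the child, attaching it would form a cycle.
        node: Optional["EII"] = self
        while node is not None:
            if node is child:
                raise ValueError("adding this child would create a cycle")
            node = node.parent
        child.parent = self
        self.children.append(child)
        return child

    def root(self) -> "EII":
        # Follow parent links until there are none left.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


# Illustrative names only.
book = EII("com.example.addressbook")
contact = book.add_child(EII("com.example.contact", id="123ABC"))
phone = contact.add_child(EII("com.example.phone", id="9993"))
```

Every node reaches the same root by following parent links, and trying to attach an ancestor underneath one of its own descendants is rejected.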

While I see little in this document that relates either to Atom or APP, my read is that if Web3S were recast using FOAF and SSE, it would be a home run. 

Or LDAP, as Tim points out.

More details below.

Mapping to the Web

A self-contained acyclic single rooted tree is mapped onto a portion of URI space via a “Non-Web3S Prefix Path”, which is defined thus:

The part of a HTTP URL path that points to a Web3S root EII. For example, if the root EII com.example.lists is addressable as http://example.com/web3sparser/joesstuff/com.example.lists and com.example.lists is a Web3S resource then web3sparser/joestuff is the non-Web3S prefix path.

A more complete example can be found in Example 3:

DELETE /someuser@example.com/LiveContacts/com.example.addressbook/com.example.contact(123ABC)/com.example.phones/com.example.phone(9993) HTTP/1.1
Host: cumulus.services.live.com

From this, we can conclude that a Web3S acyclic single rooted tree lives on the host named cumulus.services.live.com, rooted at /someuser@example.com/LiveContacts.
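As a rough illustration of that reading, the Python sketch below strips the non-Web3S prefix path from the request path in Example 3 and splits the remainder into (name, ID) steps. The function name split_web3s_path and the regular expression are my own assumptions, inferred only from the name(ID) shape visible in the example. The prefix has to be supplied by the caller, since nothing in the URL itself marks where the Web3S tree begins. Percent-encoding is ignored.

```python
import re
from typing import List, Optional, Tuple

# One path segment: a name, optionally followed by an ID in parentheses.
# This pattern is a guess based on the example URL, not the spec's grammar.
STEP = re.compile(r"^(?P<name>[^()]+)(?:\((?P<id>[^()]+)\))?$")


def split_web3s_path(path: str, prefix: str) -> List[Tuple[str, Optional[str]]]:
    """Return the (name, id) steps that follow the non-Web3S prefix path."""
    head = prefix.rstrip("/") + "/"
    if not path.startswith(head):
        raise ValueError("path is not under the given prefix")
    steps = []
    for segment in path[len(head):].split("/"):
        match = STEP.match(segment)
        if match is None:
            raise ValueError(f"unrecognised segment: {segment!r}")
        steps.append((match.group("name"), match.group("id")))
    return steps


steps = split_web3s_path(
    "/someuser@example.com/LiveContacts/com.example.addressbook"
    "/com.example.contact(123ABC)/com.example.phones/com.example.phone(9993)",
    "/someuser@example.com/LiveContacts",
)
```

Here steps comes out as four pairs, starting with the root EII com.example.addressbook (no ID) and ending with com.example.phone with ID 9993.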

Schema

Section 3 disclaims any relationship between the term “schema” as used in this document and any existing schema language, in this manner:

the actual representation of the schema (if any) is not constrained by this spec. (Read: No, this has nothing to do with XML Schema.)

That being said, the infoset is sharply constrained:

All elements in the Web3S infoset are named using reverse domain names. So the ‘proper’ name of the root element is com.live.livecontacts.addressbook. To make it easy to serialize into XML we split the name such that the last segment becomes the XML localname and the rest of the DNS name, prefixed with the protocol identifier Web3SBase becomes the namespace.

Again, an example to illustrate:

<Contact xmlns="Web3SBase:com.example"
         xmlns:Web3s="Web3S:">
   <Web3s:ID>43432</Web3s:ID>
   <Profiles>
      <Personal>
         <FirstName>Manish</FirstName>
      </Personal>
   </Profiles>
</Contact>
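The naming rule quoted above is mechanical enough to sketch in a few lines of Python. The function names split_name and join_name are mine; the only things taken from the quoted text are the rule itself (last segment becomes the localname, the rest becomes the namespace) and the Web3SBase prefix.

```python
from typing import Tuple

PREFIX = "Web3SBase:"


def split_name(full_name: str) -> Tuple[str, str]:
    """Split a reverse-domain element name into (namespace, localname)."""
    domain, dot, local = full_name.rpartition(".")
    if not dot or not domain or not local:
        raise ValueError(f"not a dotted reverse-domain name: {full_name!r}")
    return PREFIX + domain, local


def join_name(namespace: str, local: str) -> str:
    """Inverse of split_name: rebuild the full element name."""
    if not namespace.startswith(PREFIX):
        raise ValueError(f"namespace does not use {PREFIX!r}")
    return namespace[len(PREFIX):] + "." + local


namespace, local = split_name("com.live.livecontacts.addressbook")
```

Run in the other direction, the Contact element in the example above, in the namespace Web3SBase:com.example, has the full name com.example.Contact.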

Update/merge

While search and JSON are planned future enhancements, boxcarring has made it into the spec in the form of a proposed new HTTP verb: UPDATE.

The UPDATE method allows the caller to bundle “three kinds of changes to a resource – create a value that is not there, update a value that is there and delete a value that is there – into a single request.” As near as I can tell, all such requests must be scoped to a single Web3S tree.
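To illustrate what bundling those three kinds of changes against a single tree might look like, here is a toy sketch in Python. It is emphatically not the Web3SDelta format, which I am not reproducing here: the (op, path, value) tuples, the nested-dict tree, the function apply_changes, and the Nickname and Work entries are all made up for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical change record: (operation, path into the tree, new value).
Change = Tuple[str, Tuple[str, ...], Any]


def apply_changes(tree: Dict[str, Any], changes: List[Change]) -> None:
    """Apply create/update/delete changes, in order, to one nested-dict tree."""
    for op, path, value in changes:
        *parents, leaf = path
        node = tree
        for key in parents:
            node = node[key]  # KeyError if an intermediate element is missing
        if op == "create":
            # "create a value that is not there"
            if leaf in node:
                raise ValueError(f"{leaf!r} already exists")
            node[leaf] = value
        elif op == "update":
            # "update a value that is there"
            if leaf not in node:
                raise ValueError(f"{leaf!r} does not exist")
            node[leaf] = value
        elif op == "delete":
            # "delete a value that is there"
            if leaf not in node:
                raise ValueError(f"{leaf!r} does not exist")
            del node[leaf]
        else:
            raise ValueError(f"unknown operation: {op!r}")


# Made-up data loosely modelled on the Contact example.
contact = {"Profiles": {"Personal": {"FirstName": "Manish", "Nickname": "placeholder"}}}
apply_changes(contact, [
    ("create", ("Profiles", "Work"), {}),
    ("update", ("Profiles", "Personal", "FirstName"), "M."),
    ("delete", ("Profiles", "Personal", "Nickname"), None),
])
```

Note that this sketch applies changes one at a time and stops at the first failure, leaving earlier changes in place; whether a real UPDATE is all-or-nothing is a question for the spec, not something this code answers.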

This document also describes infoset merging; I can’t help but wonder if SSE fits here. As it stands, Web3S appears to be a single-user database; once you accept updates from multiple places the situation becomes a bit more complicated.

Summary

There are two new media types (Application/Web3S+xml and Application/Web3SDelta+xml), two new URI Protocols (Web3S and Web3SBase), and one new HTTP method (UPDATE) defined in this document.

I can find no discussion of binary data; in fact, everything seems defined in terms of the XML infoset. Given that all data needs to be in a namespace, and that all such namespaces need to use a new URI protocol, one can conclude that no existing XML documents can be directly handled by Web3S.

Web3S data is further constrained to be a self-contained tree. There is no general concept of a hyperlink in Web3S, neither to external data nor within a tree. To traverse this data, one needs to be aware of the specific schema employed by the application. Adopting either XLink or some conventions (e.g., xml:base + href attributes) would make this data crawlable.
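To show what I mean by crawlable, here is a small Python sketch of the xml:base + href convention. Nothing like this is in Web3S today; the Friend and Photo elements, their href attributes, and the URLs are hypothetical. The point is that a generic client could find every link without knowing the application's schema.

```python
import xml.etree.ElementTree as ET
from typing import Iterator
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predeclared XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def iter_links(elem: ET.Element, base: str = "") -> Iterator[str]:
    """Yield every href in document order, resolved against inherited xml:base."""
    base = urljoin(base, elem.get(XML_BASE, ""))
    href = elem.get("href")
    if href is not None:
        yield urljoin(base, href)
    for child in elem:
        yield from iter_links(child, base)


# Hypothetical document: the elements and URLs are invented for illustration.
doc = ET.fromstring(
    '<Contact xmlns="Web3SBase:com.example"'
    ' xml:base="http://example.com/contacts/">'
    '<Friend href="456"/>'
    '<Photo href="http://example.org/p.jpg"/>'
    '</Contact>'
)
links = list(iter_links(doc))
```

The relative reference resolves against the enclosing xml:base, while the absolute one passes through untouched, so the same walk covers links within a tree and links to external data.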

The data structures that motivated these requirements seem to have much more to do with FOAF (or perhaps OPML) than Atom or RSS. Both FOAF and OPML are inherently “linky”, distributed, and web-like.