It’s just data

CMIS

Roy Fielding: All of those points are rather small compared to my overall complaint that it isn’t appropriate to define a “REST” binding to a specific data model’s limitations. The whole point of REST is to avoid coupling between the client applications and whatever implementation might be behind the abstract interface provided by the server.

First, by way of disclosure, I had an opportunity to provide some input several months ago as this was being developed, and a number of changes were made based on my input.  I also chatted with Al Brown last night.

I also chose to be willfully oblivious to any hype.  I have no comment on any complaints that have been made along those lines.

What matters most to me is not how they derive or express this specification, but whether the operational behavior is such that a pure HTTP client can fully participate up to the limits of the HTTP specification, and AtomPub clients can participate up to the limits of the AtomPub specification.  By that I mean that extensions are fine if they are truly optional, e.g., an AtomPub client that is otherwise unaware of CMIS would still be able to traverse collections, and fetch, update, and delete resources.

Based on my discussion with Al last night, I’m cautiously optimistic that this will be the case.  I looked at specific instances of service documents and feeds before I came to this conclusion.  Furthermore, they were open and responsive to my feedback, and I believe their invitations for others to participate to be sincere.

Roy’s point that HTTP headers that affect the representation returned without a Vary header will poison caches is valid, and represents a bug.  I’ll point out that adding such a Vary header could very well cause caches to be less effective.  But, again, I view that as a simple bug, one that the TC intends to address, and one that isn’t overly surprising or a cause for concern at this early stage of standardization.
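
To make the caching concern concrete, here is a minimal sketch, using a made-up X-Include-Descendants header rather than anything actually in the CMIS draft:

GET /folder/123/children HTTP/1.1
Host: repo.example.com
X-Include-Descendants: true

HTTP/1.1 200 OK
Content-Type: application/atom+xml;type=feed
Vary: X-Include-Descendants

Without the Vary line, a shared cache that stores this response will replay the descendant-laden feed to the next client that requests the same URI without the header; with it, the cache keeps a separate entry per header value and gets far fewer hits.  Putting the variation into the URI itself (say, /folder/123/children?descendants=true) sidesteps both problems.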

Certainly neither HTTP nor AtomPub standardizes query; as such, I do not see it as a problem if a new media type is introduced for this use case.  Perhaps there might be a better way, and if so, it should be pursued, but otherwise a new media type is a perfectly acceptable solution.
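
As a rough sketch of how such a query might travel over the wire (the media type name, namespace, and query syntax below are invented for illustration, not taken from any draft), a client could POST a query document to a query collection and get back an ordinary Atom feed of results:

POST /repository/query HTTP/1.1
Host: repo.example.com
Content-Type: application/x-example-query+xml
Content-Length: 76

<query xmlns="http://example.com/ns/query">SELECT name FROM document</query>

HTTP/1.1 200 OK
Content-Type: application/atom+xml;type=feed

Because the results come back as a plain Atom feed, an AtomPub client that knows nothing about the query media type can still consume them, which is exactly the sort of optional extension described above.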

As I’ve done with numerous other Atom extensions, I plan to work with this team to add support for CMIS to the Feed Validator.


Roy’s point that HTTP headers that affect the representation returned without a Vary header will poison caches is valid, and represents a bug.  I’ll point out that adding such a Vary header could very well cause caches to be less effective.

The poisoning that occurs when leaving out a Vary header will cause real problems for users behind correctly-performing caches.

The caches that are less effective in the case of a present-and-correct Vary header (such as IE6’s internal cache) are broken, but also make up a large number of real users.

In light of those facts, I believe that Roy’s suggestion of simply using different resources with different URIs for such varying representations is a pragmatic means of solving the problem.  It looks like the TC may well go in that direction.  Hooray.

Posted by Justin Sheehy at

De-hyping CMIS

This week has seen REST experts Roy Fielding and Sam Ruby comment on CMIS. As someone directly involved in CMIS, I wanted to acknowledge both Roy’s remarks and Sam’s remarks, which follow onto Roy’s. The standards effort based in...

Excerpt from Craig's Musings at

I think that constructing an interface that can be maximally used by standard clients is a good first step and one that can be done by any ECM vendor without need for an additional standard beyond HTTP, WebDAV, AtomPub, and the various hypertext-aware media types. However, the second step is not to make CMIS-specific protocols, data formats, and relationships that are spread all over the data willy-nilly.

We are talking about generic web-based content management using universally accepted notions of folders, documents, and (at least fairly common) versioning operations. They deserve equally generic typed relationships and standard client behavior that doesn’t rely on out-of-band information and model-specific protocol tweaks. Having a lot of little standards only works well when they don’t overlap.  Imagine, for example, how large your Atom feed will become if every possible data model has its own namespace and unique set of relationship names that need to be linked within every entry.

The focus should therefore be on enhancing the media types and relation types with those universal concepts in a way that will still work as a subset for more advanced back-end data models. If we need a query interface, then it should be designed as an Atom query construction mechanism within the Atom format. It should be defined by the interface presented across HTTP, not by the interface underneath the back-end.

Posted by Roy T. Fielding at

Hi Sam,

I will definitely hold most of my comments for the appropriate time in the OASIS TC
but I quickly wanted to check with you on something.

Certainly neither HTTP nor AtomPub standardizes query; as such, I do not see it as a problem if a new media type is introduced for this use case.  Perhaps there might be a better way, and if so, it should be pursued, but otherwise a new media type is a perfectly acceptable solution.

I see query as a safe operation, so I really believe that this is one of
the few things that are better done with GET rather than POST. I see a lot of
value (and no harm) in something like an easy-to-use /myresource?q=<query>,
as it is employed by a lot of real-life search engines today.

What are your thoughts on that (beyond the somewhat obvious real-life limitations of
URL length, and the fact that using the currently proposed query language may not
look all that pretty in URLs)?

regards,
david

Posted by David Nuescheler at

If you look at aa.com, it defines a number of queries that are done as POST.

The tradeoffs between GET and POST for a query tend to revolve around whether the expected parameter size can reasonably be encoded in a URI; whether enabling people to bookmark and pass around canned-queries in IM sessions and documents is something you want to encourage or discourage; and whether caching is likely to produce benefits or merely additional overhead.

In general, yes, such tradeoffs favor GET over POST for query; but there are exceptions.  Even with the small amount of data involved in aa.com queries, the queries include date ranges and so are unlikely to be reused; and given the overhead required to process such a query, they are not something one would want to encourage people to bookmark or repeatedly re-fetch.
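
When the tradeoffs do favor GET, the whole query fits in the URI; a minimal sketch, reusing the hypothetical /myresource?q= form David mentions above:

GET /myresource?q=book HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: application/atom+xml;type=feed
Cache-Control: max-age=300

That URI can be bookmarked, pasted into an IM, and served from any shared cache for the next five minutes, none of which falls out naturally from a POSTed query.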

Posted by Sam Ruby at

Of course, if GET can’t be used for a query, there are other IETF standards-track methods that could be used (REPORT & SEARCH)...

Posted by Julian Reschke at

The tradeoffs between GET and POST for a query tend to revolve around whether the expected parameter size can reasonably be encoded in a URI [...]

I don’t agree with this at all.

In general, yes, such tradeoffs favor GET over POST for query; but there are exceptions.

The exceptions I’ve seen tend towards busting cache to force a read on a master record. This also applies to time-driven filters, e.g., using a ?since= param to get only the entries that have changed instead of the usual sliding-window feed. These techniques impact scalability, and you want to be careful about baking them into a ‘standard’. That said, the ECM space, as I recall, is full of requirements that want minimal-to-zero latency between a content write and an updated search index, including technical options such as not using a search index at all and hitting a database directly.
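
To make that concrete, a sketch of the kind of time-driven request being described (the path and parameter are hypothetical):

GET /entries?since=2008-10-02T20:30:00Z HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: application/atom+xml;type=feed
Cache-Control: max-age=60

Since nearly every client shows up with a different timestamp, each response gets cached under a URI that will rarely be requested again, so the origin server ends up doing the work on almost every request.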

POST for search can also be used to support a UI aesthetic preference, e.g. the way search works on pragprog.com:

POST /search HTTP/1.1
Host: www.pragprog.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092510 Ubuntu/8.04 (hardy) Firefox/3.0.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en,fr;q=0.7,en-us;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://www.pragprog.com/
Content-Type: application/x-www-form-urlencoded
Content-Length: 6

q=book

HTTP/1.x 200 OK
Server: nginx/0.5.10
Date: Thu, 02 Oct 2008 20:32:35 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Status: 200 OK
Cache-Control: no-cache
Content-Encoding: gzip

seems like a waste of an nginx server, but there you go.

Posted by Bill de hÓra at

I don’t agree with this at all.

Roy Fielding: there is a trade-off between GET and POST that usually involves the size of the parameter content

I personally think I’m in good company.

Posted by Sam Ruby at

The tradeoffs between GET and POST for a query tend to revolve around whether the expected parameter size can reasonably be encoded in a URI [...]

I don’t agree with this at all.

I believe Sam is talking about the inherent limits on how much data you can put into a URL.

I believe the “official” limit is that an HTTP GET request must be less than 5k in size... but in practice, most HTTP clients cannot handle a URL longer than 2000 characters.

Posted by bex at

Thanks to Sam, Bill and Julian for their thoughtful comments.

I guess I agree with a lot of the comments, and would probably distill this
discussion down to this: there are advantages to using GET in many use cases, but there
are also use cases where POST (or, as Julian mentions, SEARCH or REPORT) is
more appropriate, which I would fully agree with.

Given the conversation, and the fact that nobody argued against the use cases
and benefits of GET, I would probably second Bill’s caution about baking
“the more exceptional case” into a ‘standard’ as the only way to execute
a query.

Thanks again.

Posted by David Nuescheler at

CMIS, APP, Zen-SOAP and WS-KitchenSink: some data points

The recent release of an early draft of a content management specification (CMIS, for Content Management Interoperability Services) provides an interesting perspective on not just SOAP-versus-REST but also Zen-SOAP versus WS-KitchenSink. I know...

Excerpt from William Vambenepe's blog at

De-hyping CMIS

This week has seen REST experts Roy Fielding and Sam Ruby comment on CMIS. As someone directly involved in CMIS, I wanted to acknowledge both Roy’s remarks and Sam’s remarks, which follow onto Roy’s. The standards effort based in OASIS that is...

Excerpt from Content Meltdown at
