RestEchoApiOneUriForEachEntry


Up: RestEchoApiDiscuss

One of the biggest differences between Tim Bray's proposal and the draft RFC is that Tim's assumes each Echo Entry has its own URI. I didn't make that assumption in my RFC, and had to change the interface to accommodate that. I think this is a great point of discussion, and if we can get a consensus that each Echo Entry deserves its own URI then I think the API can get even more elegant.

This brings up two distinct questions:

  1. Should each Echo Entry have its own URI, one you can do a GET on to retrieve the XML?

  2. Should the RestEchoApi allow PUT/POST/DELETE on that URI as part of the API?

[TimBray] To the questions: yes and no. Yes, entries should have URIs. Why? Check the Architecture of the Web draft: the single most important web-architectural principle is that things that matter should have URIs. No, there's no need to support HTTP verbs other than GET for the individual entry URIs. Why not? It's harder and I haven't yet seen any real practical advantages. Route the transactions through the URI of the publication.

[DannyAyers] On that last point - why is it harder? Won't the material at the publication URI have to be unmarshalled before processing and routed/marshalled again after, whereas per-item is direct??

[TimBray] Well, when I create a new post or update an old one, in most publishing systems I'm going to want to route that through my publishing-system logic, right? To build a search index or apply security rules or build a category/date directory or create an AvantGo version, all sorts of things. So since it's really the publishing system I'm asking to do the work, it seems like I ought to post to the publishing system. Put another way, the publishing system is going to want to intercept PUT and DELETE requests against individual entries anyhow, so why not just send them there? As to why harder, it's trivial to set up software to watch a single URI and receive chunks of XML posted to it and do something smart based on what it got; it's more work to configure things so that PUTs and DELETEs against all the URIs in your webspace get routed to your software. Not impossible, but (a) trickier, (b) kind of deceiving since it's really the pub sys doing the work and (c) doesn't really have any advantages that I can see.

[DaveWarnock RefactorOk] 1. Yes, that includes all types of entry (so it includes comments)

[AsbjornUlsberg, RefactorOk] What I seem to miss is why the publishing system has to have one URI and can't handle all URIs on a web site. With mod_rewrite on Apache, it's ridiculously easy to intercept all requests and rewrite them into other URLs that the publishing system can receive. E.g. "http://www.example.com/cars/opel/123.html" can be translated by mod_rewrite to "http://www.example.com/publish.cgi?category=opel&article=123".
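The mapping in that example can be sketched outside Apache as a plain pattern rewrite. This is only an illustration of the idea in Python, not mod_rewrite configuration; the pattern and the function name `rewrite` are assumptions chosen to match the example URLs above.

```python
import re

# Hypothetical pattern mirroring the example URL in the text:
# /cars/<category>/<article>.html
ENTRY_PATH = re.compile(r"^/cars/(?P<category>[^/]+)/(?P<article>\d+)\.html$")


def rewrite(path):
    """Map a public entry path to the internal publish.cgi URL, or None."""
    match = ENTRY_PATH.match(path)
    if match is None:
        return None  # not an entry URL; leave the request alone
    return "/publish.cgi?category={category}&article={article}".format(
        **match.groupdict()
    )
```

With a rule like this in front of it, the publishing system still sees a single script URL, whatever public URI the client addressed.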

Also I'm worried about entries and other resources which can't be retrieved over HTTP. The reason for the unretrievability may be a firewall, or that the resource was produced in an off-web environment (e.g. MS Word), etc. Therefore I state that we cannot require Echo resources to have URIs, but we MUST require them to have a globally unique ID, à la the Message-IDs of NNTP articles.
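As a rough illustration of a Message-ID-style identifier, Python's standard library can mint one without the resource being retrievable anywhere. The function name `new_entry_id` and the domain are assumptions for the sketch; nothing here is a proposed Echo ID format.

```python
from email.utils import make_msgid


def new_entry_id(domain):
    """Mint a Message-ID-style identifier of the form <unique@domain>."""
    # make_msgid combines a timestamp, the process id and random bits,
    # so uniqueness does not depend on the entry having a URL.
    return make_msgid(domain=domain)
```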

2. More complicated.

I thought PUT is normally suggested for new and replacement content (i.e. you PUT a document, because GET transfers the state to the client and then you PUT the new representation back to the server). Purest REST seems to suggest PUT for both new entry and edit entry. This does not fit well with a server-allocated URI for new entries (unless we add an extra step to first ask for the URI to be used, which does not seem sensible here). Therefore our options are

  a) use PUT for edit and POST for new

  b) use PUT for new and edit, but before new use POST to ask for the URI

  c) use POST for new and edit

Of these c) seems the simplest.
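To compare the three options side by side, here is a small sketch listing the requests a client would send under each. It assumes, beyond what is stated above, that new entries go to the publication URI and edits go to the entry's own URI; the function name and the placeholder string are hypothetical.

```python
def requests_for(option, action, publication_uri, entry_uri=None):
    """List the (method, uri) pairs a client sends under option a, b or c.

    action is "new" or "edit"; entry_uri is needed only for "edit".
    """
    if action == "edit":
        # Options a) and b) edit with PUT, option c) edits with POST.
        method = "POST" if option == "c" else "PUT"
        return [(method, entry_uri)]
    # action == "new": the server allocates the entry URI.
    if option == "b":
        # Extra round trip: ask for a URI first, then PUT the entry to it.
        return [("POST", publication_uri),
                ("PUT", "<URI returned by the server>")]
    return [("POST", publication_uri)]
```

Only option b) costs a second request for a new entry, which is the extra step described above.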

DELETE has fewer problems: it is clear what you want to do and the URI exists. The only problem is that apparently some tools don't support DELETE and many programmers are not used to using it. So we should use DELETE if we wish to be evangelists for REST with strong opinions about doing the right thing; otherwise, for pragmatic reasons, we should use POST to get going quicker.

Discuss

[TimBray] Timothy Appnel wrote me to point out that when you create an entry, all you need to get back with the 201 Created code is the new URI. Lighter-weight, simpler. If you want to get the whole thing, that's what GET is for. I suspect TimA will blog it.
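A minimal sketch of that exchange as a WSGI application: a POST to the publication URI stores the entry and answers 201 Created with only a Location header. The in-memory `STORE`, the `/entries/<n>` URI scheme, and the name `publication_app` are all assumptions for illustration.

```python
import itertools

_next_id = itertools.count(1)
STORE = {}  # entry path -> raw XML bytes (stand-in for a real entry store)


def publication_app(environ, start_response):
    """WSGI app for the publication URI: POST creates a new entry."""
    if environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    path = "/entries/%d" % next(_next_id)  # server-allocated URI
    STORE[path] = body
    # Only the new URI goes back; the client can GET it for the full entry.
    start_response("201 Created",
                   [("Location", path), ("Content-Length", "0")])
    return [b""]
```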

[TimBray] For posting comments, why not use the same URI that's used for editing, just as you do for deleting? Then you can just do

<comment id="URI of entry being commented on">

You could have some optional trackback-like fields identifying the source of the comment.

[NickChalko RefactorOk] I agree because CommentsAreEntries.

[DannyAyers RefactorOk] But this doesn't look much like the syntax used elsewhere, i.e. CommentEntryExample. See related comments in RestEchoApiDiscuss on syntax reuse.

[DaveWarnock RefactorOk] We have consensus that CommentsAreEntries. Tim, you have pointed out above that every entry should have a URI and we post to it. Therefore when we post a comment, it is exactly the same as posting an entry. The 201 is for the comment URI. We post to the comment URI to edit or delete it. Provided other things like TrackBack are also implemented as entries, we have an absolutely simple and consistent API by taking what has been suggested for entries.

[MishaDynin RefactorOk] Blogger has distinct notions of posting (adding the post to the database), publishing (posting, rendering pages, and uploading them), and syndication (posting and emailing them out). How do I capture this with REST?

[JustinErenkrantz RefactorOk] I disagree with TimBray's comment that handling DELETE issued directly against the URIs of the entries themselves is harder than having a single URI receive chunks of XML posted to it. In an Apache httpd-centric implementation, you can use a handler/script that relies on r->path_info, which holds the 'unprocessed' components of the URI. (Most other HTTP servers have similar mechanisms.) Apache HTTP Server's mod_mbox uses this technique to its advantage: it registers the 'root' location, and then different methods can be handled or virtual locations served. Therefore, I don't believe this is a significant challenge, and such a setup would allow clean matching with the DELETE semantics of HTTP.
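The same idea can be sketched with WSGI, whose PATH_INFO plays a role similar to r->path_info: one handler is registered at a root location and tells entries apart by the rest of the path. This is a Python analogue rather than an Apache module, and the `ENTRIES` store, its sample key, and the name `entry_app` are hypothetical.

```python
ENTRIES = {"/2003/07/07/entry14": b"<entry/>"}  # hypothetical entry store


def entry_app(environ, start_response):
    """Single handler mounted at a root; PATH_INFO names the entry."""
    path = environ.get("PATH_INFO", "")
    method = environ["REQUEST_METHOD"]
    if path not in ENTRIES:
        start_response("404 Not Found", [("Content-Length", "0")])
        return [b""]
    if method == "GET":
        body = ENTRIES[path]
        start_response("200 OK", [("Content-Type", "application/xml"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    if method == "DELETE":
        # The publishing logic runs here, but the client addressed the entry.
        del ENTRIES[path]
        start_response("204 No Content", [])
        return []
    start_response("405 Method Not Allowed", [("Allow", "GET, DELETE")])
    return [b""]
```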

[TomasJogin, RefactorOk] What if the software that you use to manage your weblog is constrained to a cgi-bin directory only? I mean, is this a far-fetched not-gonna-happen scenario? One which shouldn't be taken into consideration in the making of this specification? Because, if your hosting provider restricts scripts to certain directories, you are going to have to route publishing operations through a couple of specific scripts in a specific location, not just any URL on the weblog. Furthermore, not all entries have a distinct URL, either. Some are just a-names on a daily, weekly or monthly archive (/weblog/archives/20030707.html#entry14). With that kind of setup, too, you have to route publishing operations through certain scripts, not just send PUT, POST, DELETE or whatever to the permalink URL in question.
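The a-name case can be made concrete: the fragment part of a URL is not sent to the server in an HTTP request, so a DELETE aimed at such a permalink would reach the server as a request against the whole archive page. A small sketch using the example permalink above (the helper name `request_target` is hypothetical):

```python
from urllib.parse import urlsplit


def request_target(permalink):
    """Split a permalink into what the server sees and the client-side fragment."""
    parts = urlsplit(permalink)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment stays on the client; it never appears in the request line.
    return target, parts.fragment
```

For "/weblog/archives/20030707.html#entry14" the server-side target is "/weblog/archives/20030707.html", and "entry14" never leaves the client.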

[AsbjornUlsberg, RefactorOk] The situation today may be that feed resources (entry, content, etc.) don't each have a distinct URI attached to them, but I think this can't continue. I can live with <content> not having a URI (except <content> with "src" of course), but <entry> should be retrievable.

As I've stated earlier, not all Echo feeds may be retrievable over HTTP. We may decide that this is unacceptable behavior, but as we haven't yet, we have to consider it. In such cases, each resource should at least have a globally unique ID, so it can be identified by other systems. But if a feed is retrievable over HTTP, all sub-entries of that feed should be too. And the retrievable URI should of course point to an Echo XML version of the resource, not an HTML version. If the resource has alternative views (such as HTML), this should somehow be noted in the Echo XML.

[JasonHx] +1 on entries having URIs. Also, to increase wikibility, I'd like to encourage '/' as the virtual file system path indicator in an echo-wiki-space URI and '#' to anchor named links inside a given entry's <content> payload to maintain the dimensions of expression a la HTML and more importantly for API design, to allow for XPath-like functional axes across entries. See Containers for more on this.

A recent demonstration of a Get / Post / Put / Delete model is Syncato.

[AsbjornUlsberg] +1, Jason.


CategoryArchitecture, CategoryModel, CategoryApi, CategoryRest