PaceSimpleResourcePosting

Abstract

Summary: A client needing to upload a compound document uses an Atom endpoint to POST each piece of binary content to an Atom server. The endpoint returns the address of the uploaded content as a resource URI. The client then uses the resource URI string whenever it needs to refer to the resource within an Atom entry it is composing. It then uses the standard Atom posting endpoint to upload the entry, which then refers indirectly to the uploaded binary content. Subsequent editing of the binary resources is then possible using standard HTTP methods.

This proposal extends the AtomAPI to allow for a new creation URI, ResourcePostURI, to be used for simple, efficient uploading of resources referenced by a separate Atom entry. It also allows an Atom server to provide for updating (via PUT) and deleting (via DELETE) of a previously uploaded resource.

This proposal is an alternative to PaceObjectModule, PaceDontSyndicate, and PaceResource. It is very similar to PaceNonEntryResources but differs in various details (TBD: list these, reconcile the two). It is compatible with WebDAV but does not require that a server support WebDAV.

The proposal depends on a separate change to the Atom Syndication Format to allow an atom:entry to refer to its content indirectly rather than inline between <content>...</content>. Two proposals both address this: PaceSimpleContentType, which adds a <link rel="content"> to atom:entry, and PaceContentSrc, which adds a @src attribute to atom:content. Either would work.

Status

Open

Rationale

We want to be able to efficiently upload large binary resources either as the content of entries or as sub-parts of entries. Use cases include a photo blog where each entry is a photo (with Atom metadata), or an entry containing one or more pictures to be posted as a compound document (see http://www.imc.org/atom-syntax/mail-archive/msg04330.html). We also want to be able to update or delete resources which were previously uploaded, without necessarily requiring WebDAV support.

Uploading large binary pieces of content inside an Atom XML document is possible via appropriate encoding (see PaceObjectModule, PaceDontSyndicate, PaceResource). However, this introduces additional overhead (33% in the case of base64 encoding) which may be an important issue in contexts such as moblogging.
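The 33% figure follows from base64 mapping every 3 input bytes to 4 output characters. A minimal sketch that checks this arithmetic; the function name and the 300,000-byte payload size are illustrative choices, not part of the proposal:

```python
import base64


def base64_overhead(raw_size: int) -> float:
    """Return the fractional size increase from base64-encoding raw_size bytes."""
    raw = bytes(raw_size)  # zero-filled payload; the content does not affect the length
    encoded = base64.b64encode(raw)
    return (len(encoded) - raw_size) / raw_size


# 300,000 raw bytes become 400,000 encoded characters.
print(f"{base64_overhead(300_000):.1%}")  # prints 33.3%
```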

Using a MIME multipart/related message via HTTP would avoid the encoding overhead and allow posting of an entry and multiple associated resources, such as images, in a single HTTP transaction (see http://www.intertwingly.net/blog/2004/05/07/Atom-MIME for an example of this). This has not been proposed as a Pace, however.

This proposal addresses the efficient creation of compound document entries, including as a simple case the creation of entries whose content consists entirely of an image or other binary resource. It addresses the subsequent updating and deletion of the resources by allowing Atom servers to advertise support for PUT and DELETE. It is not intended to address the question of including binary content directly in feeds.

Proposal

A new endpoint, ResourcePostURI, is added to Section 5, Functional Specification of the Atom API document in a new sub-section. In addition, either part (2) of PaceSimpleContentType is adopted (link rel="content"), or as an alternative the addition to the Atom Format given below.

The result of using the ResourcePostURI is a Location: header containing a "resource URI". An Atom server SHOULD support HEAD, PUT and DELETE methods on resource URIs returned in this manner. These methods operate to query, update or delete the given resource, per the HTTP 1.1 specification.

To support capability discovery, servers SHOULD support the OPTIONS method on the resource URI and return the list of other allowed methods in the Allow: header. Further, PUT or DELETE MUST each return an appropriate HTTP status if not implemented -- e.g., 405. (Not all CGI based servers can provide a valid Allow: header from OPTIONS. If PUT and DELETE are in Allow:, user agents SHOULD assume they are in fact available for the URI. If not, user agents MAY assume they are not available. User agents MAY alternatively attempt PUT or DELETE to the URI and check the result status.) (Note that a workaround for lack of PUT support is to create a new resource via ResourcePostURI and update the relevant link to point to the new resource.)
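The discovery rule above could be sketched on the client side roughly as follows. The function names and the "put"/"repost" labels are hypothetical; the sketch takes the simpler of the two permitted behaviours, treating a missing PUT in Allow: as "not available" rather than probing with a trial PUT:

```python
def allowed_methods(allow_header):
    """Parse an Allow: header value into a set of method names.

    Returns an empty set when the header is missing or empty, which is
    what a CGI-based server that cannot produce Allow: would look like.
    """
    if not allow_header:
        return set()
    return {m.strip() for m in allow_header.split(",") if m.strip()}


def plan_update(allow_header):
    """Decide how to update a resource: in place, or by re-posting.

    "put"    -> PUT is advertised, so assume it works on this URI.
    "repost" -> fall back to creating a new resource via the
                ResourcePostURI and updating the link to point to it.
    """
    return "put" if "PUT" in allowed_methods(allow_header) else "repost"
```

A client that prefers the other permitted behaviour would attempt the PUT regardless and treat a 405 status as the signal to fall back.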

Atom servers MAY also support some profile of WebDAV for these resources. The above capability discovery and usage is upwardly compatible with WebDAV, so clients MAY elect to support either or both.

(TODO: I can't figure out what section of the Atom API Specification the above should go into; it's not really part of the Atom API as it stands today, and it's certainly not part of the feed format. Suggestions?)

Additions to Atom API

(See Atom API Specification.)

5.x ResourcePostURI

The ResourcePostURI is used to create new non-entry resources. The client POSTs a resource of the desired MIME type directly to this URI. If the request is successful then the server returns a new unique URI where a representation of the resource may be retrieved. The URI returned MUST be suitable for use in a subsequent HTTP GET and MUST return the resource data originally uploaded. The URI returned MAY also be usable for editing of the resource via PUT and DELETE, or via full WebDAV.

5.x.1 Locating

For creating a new non-entry resource, the link tag is used. Note that a link tag is used in both HTML and in the Atom format. A link tag of the following format points to the ResourcePostURI for a site. In HTML the link tags are always found in the head element, while in Atom they may appear as children of the feed and entry elements.

(Note: There is an open discussion regarding the use of @title to distinguish links associated with different feeds; this proposal will follow whatever the consensus is on that point.)

5.x.2 Request

The request contains a resource, sent through a standard HTTP POST, e.g.:

POST /_do/exampleblog/post_resource HTTP/1.1
Host: www.example.com
Content-Type: image/jpeg
Content-Length: nnn

...raw bytes of image go here...
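As a rough illustration, a client could issue the request above with Python's standard http.client module. The function names are hypothetical, and the ResourcePostURI is assumed to have been discovered already; http.client supplies the Host: header itself and computes Content-Length: for a bytes body:

```python
import http.client
from urllib.parse import urlsplit


def build_resource_post(resource_post_uri, content_type):
    """Split the ResourcePostURI into connection details and request headers."""
    parts = urlsplit(resource_post_uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    headers = {"Content-Type": content_type}
    return parts.scheme, parts.netloc, target, headers


def post_resource(resource_post_uri, data, content_type):
    """POST raw bytes to the ResourcePostURI; return (status, Location value)."""
    scheme, netloc, target, headers = build_resource_post(
        resource_post_uri, content_type
    )
    if scheme == "https":
        conn = http.client.HTTPSConnection(netloc)
    else:
        conn = http.client.HTTPConnection(netloc)
    try:
        conn.request("POST", target, body=data, headers=headers)
        response = conn.getresponse()
        response.read()  # drain any body so the connection closes cleanly
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```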

5.x.3 Response

The expected status codes from a POST are 201, 303, 400, and 500. 401, 404, and 410 are also possible.

5.x.3.1 201 Created

Response MUST include a Location: header with the URI of the created resource, i.e. the URI used to retrieve the resource representation in a subsequent HTTP GET. The returned resource URI SHOULD also support PUT and DELETE. The server SHOULD omit the content of the resource in the response, since it would be redundant to return it to the client. The server MAY return an ETag or other caching information in the response, and the client and any intermediate proxies MAY use the information returned in this response for normal caching.

5.x.3.2 303

Similar to 201 but no caching is allowed. Response MUST include a Location: header as in 5.x.3.1.

5.x.3.3 400

Indicates that the server believes that the data sent constitutes an invalid request. A short description of the error will appear on the status line itself. A longer description will appear in the body.

5.x.3.4 500

Indicates that the server encountered an internal error while processing this request (such as an unhandled exception). A short description of the error will appear on the status line itself. A longer description will appear in the body.
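One way a client might act on these status codes is sketched below. The exception class and function name are hypothetical. Resolving the Location: value against the request URI is an added assumption to cope with servers that return a relative reference, not something this proposal requires:

```python
from urllib.parse import urljoin


class ResourcePostError(Exception):
    """Raised when a POST to the ResourcePostURI did not yield a resource URI."""


def resource_uri_from_response(status, headers, request_uri, reason=""):
    """Return the resource URI for a 201 or 303 response, else raise.

    headers is any mapping of header names to values; names are matched
    case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if status in (201, 303):
        location = lowered.get("location")
        if not location:
            raise ResourcePostError(f"{status} response lacks a Location: header")
        return urljoin(request_uri, location)
    # 400, 500, 401, 404, 410 and anything unexpected: surface the status line.
    raise ResourcePostError(f"{status} {reason}".strip())
```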

Impacts

Clients uploading compound documents must be prepared to create or rewrite URIs within (X)HTML entry content.

Notes

See the Usage section below for details on how this would be used for uploading.

See http://www.imc.org/atom-syntax/mail-archive/msg04302.html and http://www.imc.org/atom-syntax/mail-archive/msg04346.html for the original draft proposal and discussion of the various issues. This proposal corresponds to Option #6 from the June meeting notes, with the addition of optional PUT and DELETE methods.

See http://www.imc.org/atom-syntax/mail-archive/msg05226.html for discussions of this proposal.

Note that servers need not return relative URIs in the Location: header; they may return arbitrary URIs if desired. That is, clients MUST NOT assume any relationship between the resource URIs and other Atom URIs. Of course many servers will put the resource URIs in a reasonable hierarchy next to the entries for the feed they're a part of. However, making this a requirement seems like an unnecessary restriction (see http://www.imc.org/atom-syntax/mail-archive/msg04532.html). It also would appear to make it impossible to implement an Atom server purely via CGI scripts. Note that the URI returned may be a CGI script reference that supports GET and HEAD, and possibly OPTIONS, PUT and DELETE as well. For example, a CGI based Atom server may return Location:

If a client has a hard requirement to control the URIs of the resources it uploads, WebDAV is probably the answer. This proposal is meant to deal with the other end of the spectrum, where the client does not really care what the URIs of non-top-level resources are and just wants to move the bits from point A to point B with as little fuss as possible.

This proposal may be similar to an existing(?) TypePad Atom extension, UploadURI, but it's not clear whether the details are the same.

See http://imc.org/atom-syntax/mail-archive/msg05443.html for a discussion of why PUT and DELETE are 'SHOULD' rather than 'MUST' support requirements in PaceSimpleResourcePosting.

Usage

When a client wishes to post a large binary image as the content of an entry, it does the following:

1. HTTP POST the image data to the ResourcePostURI, retrieving the Location: URI from the response;
2. Set the "src" attribute of the "content" element to the retrieved image URI if using PaceContentSrc, or the "href" attribute of the "link" element if following PaceSimpleContentType;
3. HTTP POST the Atom entry to the PostURI.
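For the PaceContentSrc variant, the entry built in the second step might look roughly like the output of the sketch below. The function name is hypothetical, and the namespace URI is the one from the later Atom 1.0 format, used here purely as a placeholder assumption. A real entry would also carry whatever other elements the format draft requires:

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace; substitute the one your Atom draft defines.
ATOM_NS = "http://www.w3.org/2005/Atom"


def build_photo_entry(title, resource_uri, media_type):
    """Build a minimal entry whose content points at an uploaded resource."""
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content")
    content.set("type", media_type)
    content.set("src", resource_uri)  # the URI from the Location: header
    return ET.tostring(entry, encoding="unicode")
```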

When a client wishes to post a compound document consisting of XHTML content with one or more embedded images, it does the following:

1. HTTP POST each image in turn to the ResourcePostURI, remembering the Location: response URI for each one;
2. Set the "src" attribute of each "img" element in the XHTML content about to be sent to the appropriate URI;
3. HTTP POST the Atom entry to the PostURI.

Exceptions: If there are problems uploading image #5 of 6, images 1-4 are safely uploaded and need not be retried. Conversely, if the client dies in the middle of a conversation, some resources may be uploaded while others are not; this proposal does not attempt to deal with this problem, and assumes that servers need to have backstops in place to allow cleanup of such unreferenced resources.
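The URI rewriting in the compound-document steps could be sketched as follows. The function name is hypothetical, and the sketch assumes the XHTML content is well-formed, declares the XHTML namespace, and that the client keeps a mapping from each local image reference to the Location: URI returned for it:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def rewrite_img_srcs(xhtml, uploaded):
    """Replace each img src found in `uploaded` with its resource URI.

    uploaded maps the original (local) src value to the URI returned in
    the Location: header when that image was posted. Images that are not
    in the mapping are left untouched.
    """
    ET.register_namespace("", XHTML_NS)  # serialize without an ns0: prefix
    root = ET.fromstring(xhtml)
    for img in root.iter(f"{{{XHTML_NS}}}img"):
        src = img.get("src")
        if src in uploaded:
            img.set("src", uploaded[src])
    return ET.tostring(root, encoding="unicode")
```

Because already-uploaded images stay in the mapping, a client that fails on image 5 of 6 can retry only the missing ones before rewriting.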

When a client wishes to update a previously uploaded binary image which is the content of an entry, it does the following:

1. Get the Atom entry containing the resource URI of the binary image and parse it out.
2. Do an HTTP GET of the data at the resource URI, if desired.
3. Edit the data locally.
4. Do an HTTP PUT of the data to the resource URI.

Exceptions: If the Atom server does not support PUT for the URI, it omits PUT from the Allow: header returned in step 2, and the client either warns the user that the resource isn't editable or it works around the problem by uploading the resource a second time at a new URI.

Extensions

In some cases, it may be useful for a client to provide a hint to a server as to the name of the resource to be uploaded. As an extension to this proposal, we may repurpose the Content-Location: header for this use. This would be added to the 5.x.1 section above:

In addition, a client MAY provide a Content-Location: header with the POST request to the ResourcePostURI. The server MAY use the contents of this header in constructing the URI returned in the response Location:. For example, Content-Location: mycat.jpg is a hint to the server that it may want to use the leaf name mycat.jpg in the URI it returns.
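On the server side, honoring the hint might look something like the sketch below. The function name, the character whitelist, and the random fallback name are all assumptions for illustration. The proposal leaves the server free to ignore the hint entirely:

```python
import posixpath
import re
import uuid

# Conservative whitelist for leaf names taken from a client hint.
SAFE_LEAF = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def resource_leaf_name(content_location, taken):
    """Choose a leaf name for a new resource URI.

    Uses the last path segment of the Content-Location: hint when it is
    present, safe, and not already in use; otherwise falls back to a
    server-generated name.
    """
    if content_location:
        leaf = posixpath.basename(content_location.strip())
        if SAFE_LEAF.fullmatch(leaf) and leaf not in taken:
            return leaf
    return uuid.uuid4().hex
```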