Apparently, the Monastic
one (a.k.a., my personal albatross) sees
some value in a data binding package, and he correctly points out
that validation may not be a necessary part of the
equation. I'll take what I can get.
So, I felt it instructive to explore exactly how far that
is. This resulted in a new essay, entitled, Soap
By Example. Enjoy.
Sam, I don't get it. So you show that a SOAP application can be written using an HTTP module and the XML DOM. So what? I could similarly show you could write the application using the C string functions and POSIX sockets. What does that prove?
Sorry for the quick second post, but it just struck me that instead of asking people to add soap:Body and soap:Envelope elements to their documents, maybe SOAP processors should just accept messages which happen to be missing those elements.
In this case, the gs:doGoogleSearch element is effectively all the info you need to process.
I can't be the first to suggest that. Guess I now have a reason to re-read the ever-delightful SOAP specs.
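A minimal sketch of what such a lenient processor might do, assuming Python's standard `xml.etree.ElementTree` and the SOAP 1.1 envelope namespace. The function name `extract_payload` and the `urn:GoogleSearch` namespace in the sample documents are assumptions for illustration, not anything taken from the essay:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def extract_payload(xml_text):
    """Return the payload element, with or without a SOAP wrapper around it."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}Envelope" % SOAP_NS:
        # No envelope at all: treat the document element as the payload.
        return root
    body = root.find("{%s}Body" % SOAP_NS)
    if body is None or len(body) == 0:
        raise ValueError("SOAP envelope has no body content")
    # Enveloped message: the payload is the first child of soap:Body.
    return body[0]


# Hypothetical sample messages: the same request, bare and wrapped.
bare = '<gs:doGoogleSearch xmlns:gs="urn:GoogleSearch"><q>soap</q></gs:doGoogleSearch>'
wrapped = (
    '<soap:Envelope xmlns:soap="%s"><soap:Body>%s</soap:Body></soap:Envelope>'
    % (SOAP_NS, bare)
)
```

Both `extract_payload(bare)` and `extract_payload(wrapped)` hand back the same `doGoogleSearch` element, so the rest of the handler would not need to care which form arrived.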
Simon - believe it or not, we are making progress. My point was to identify exactly what "all the way to x" meant. Apparently it means two elements.
Modulo backwards compatibility problems, and the time it takes to come to consensus and wide deployment, I have no philosophical problem with a hypothetical future version of a spec that indicates that a headerless element is to be treated as equivalent to one with empty headers. Personally, I think that the issue of not treating URIs as first class concepts is a more important problem to tackle than two elements, but that's just me.
Now that I've had some fun at your expense, I'm willing to make my first truly controversial statement. IMHO, the reason for these elements is to allow extensible headers to be added. What can such headers be used for? For starters, they can address cases where it is not practical to punt. Joe happens to be running on a server where he has access to his .htaccess file. Not everyone is so lucky.
One of my pet peeves is simple examples that don't encourage fundamental good behaviour. Although less than, greater than and ampersand aren't likely to appear in a Google query, I'd feel all warm and fuzzy inside if your do_search() function used something like:
instead of template % (key, q) to ensure that the query remains well-formed XML.
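A minimal sketch of that kind of escaping, assuming the standard library's `xml.sax.saxutils.escape`. The two-slot `template` and the helper name `fill` are hypothetical stand-ins, since the essay's own template is not reproduced in this thread:

```python
from xml.sax.saxutils import escape

# Hypothetical stand-in for the essay's template.
template = "<key>%s</key><q>%s</q>"


def fill(key, q):
    # escape() replaces &, < and > with entity references, so the
    # substituted values cannot break the surrounding markup.
    return template % (escape(key), escape(q))
```

With this in place, a query such as `a<b & c` is sent as `a&lt;b &amp; c` rather than producing a malformed document.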
"I just don't see what you gain with your soap:Envelope and soap:Body elements."
You get a response.
Without them, a SOAP service - that is written by somebody over at Company X which provides a service you're interested in - will not respond to your query.
I recently found myself in this situation. In my humble opinion, it would have been a simpler and better interface if the extra SOAP elements didn't get in the way, particularly on the request side, where a command like "give me the metadata for object 41" or "send me objects 12 and 13" would have made for a very simple GET request. But I didn't get to make these decisions; I just have to make our piece of the client's puzzle talk to theirs.
Posted by Daryl Oidy at
That's fun, eh? Weird.
"The reason for these elements is to allow extensible headers to be added. What can such headers be used for? For starters, they can address cases where it is not practical to punt. Joe happens to be running on a server where he has access to his .htaccess file. Not everyone is so lucky. "
Fascinating stuff. Not. I guess maybe it hasn't occurred to you that XML itself is plenty extensible, and that the kind of extensibility you're proposing to add is exactly the kind of stuff that gives SOAP a lousy name with folks who value the Web more than Web Services.
If you want SOAP's odd notion of extensible and are happy to glue blank envelopes to other people's messages to get it, that's fine for you. Just keep in mind that a lot of people won't fall for that - we don't need it, want it, or like it.
A lot of us do in fact have access to .htaccess, and I hate to imagine someone building a serious services site who didn't. As a web server admin, I'd be a lot more cautious about giving my users access to SOAP processing than to .htaccess.
Thanks for the fun, but I have to get back to the holidays. If you have a chance to study XML's capabilities beyond the boundaries of SOAP, maybe we can talk sometime.
Daryl: OK, I'm game. I've made the fix. I also changed it so that the template is filled in using named parameter association, not positional.
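For readers following along, named parameter association in Python's `%` formatting might look roughly like the sketch below. The template shape, the `urn:GoogleSearch` namespace, and the name `build_request` are assumptions for illustration and may differ from what the essay actually uses:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical request template: element names follow the thread, the
# namespace URIs and overall shape are assumptions.
TEMPLATE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">
      <key>%(key)s</key>
      <q>%(q)s</q>
    </gs:doGoogleSearch>
  </soap:Body>
</soap:Envelope>"""


def build_request(key, q):
    # Named slots: the order of the values no longer matters, and each
    # value is escaped before it is substituted.
    return TEMPLATE % {"q": escape(q), "key": escape(key)}
```

The practical gain over positional substitution is that adding or reordering slots in the template cannot silently swap the key and the query.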
Simon, my guess is that if Joe was ever on a machine where he could not modify his .htaccess files, he would probably want some place else to put this information than the element itself. In any case, the only "SOAP processing" we are talking about here is the ability to get access to the XML document which was sent.
All I've heard so far that you don't want or like is the two elements. In this case, my attitude is like Daryl's - if two elements is the cost of consensus, then I'm happy to focus my energies on other issues.
Sam, I think it's time to focus your energy on other issues.
You've stripped SOAP down to the point where it masquerades as REST, for the mere price of "two elements". Then you try to lead people down the SOAP path by promising them SOAP-style extensibility using those two elements, effectively leaving REST behind.
SOAP and REST are two separate paths once you get past trivial examples. Pretending that this is just about two elements is a nice public relations effort, but not very constructive. If I wanted a SOAP response, I'd have asked for one. I don't want one.
Dare's right - this is passe. Let's see if there's anything new in 2003.
I was worried when you picked a case in RESTLog that was too close to SOAP. Which is why I keep coming back to my questions.
Sam - "Simon, my guess is that if Joe was ever on a machine where he could not modify his .htaccess files, he would probably want some place else to put this information than the element itself."
I have to agree with Simon who said, "...and I hate to imagine someone building a serious services site who didn't."
Either security info comes across in the document or the HTTP headers. Just because SOAP puts it in an element called 'header' doesn't mean it's not in the document.
And even in the extreme case you present (no access to a .htaccess file), if I have access to a scripting facility, can't I process those headers myself? Sure, I wouldn't want to re-implement Basic and Digest authentication myself in a script, but the point is that even in the situation you set up (lack of .htaccess) you won't be locked out of the RESTLog API.
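To make "process those headers myself" concrete, here is a rough sketch of decoding a Basic `Authorization` header value in a script. The helper name `parse_basic_auth` is hypothetical, how the header value reaches the script depends on the server, and a real deployment would still need to verify the credentials against something:

```python
import base64


def parse_basic_auth(header_value):
    """Return (user, password) from an Authorization header value, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers both malformed base64 and undecodable bytes.
        return None
    user, sep, password = decoded.partition(":")
    if not sep:
        return None
    return user, password
```

Digest is considerably more involved, which is the reason not to want to re-implement it by hand.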
Can we move on to the other cases? And when we finish with those I was thinking of adding an interface to RESTLog for uploading pictures... :)
Heheh, why yes HTML does have a whole 'head' element for meta data.
Would you add security info to HTML HEAD?
Maybe HTML could be made more extensible by wrapping *it* in a SOAP envelope. Or SOAP could be made more extensible by wrapping *it* in an HTML envelope. Of course do we put the SOAP or the HTML as the outer envelope? This is the second time this problem has come up. The "Multiple Envelope" problem.
What about other types of documents returned from servers and presented in browsers? Should SVG have been given a SOAP envelope? How about GIFs, PNGs, Flash, and Acrobat files?
In the more specific case of XML formats, you already wrote a fine essay about how to extend such formats. Are you advocating not using namespaces anymore?
The body of a createNews request is RSS. It is extensible via Namespaces. If I wanted to put something in there that you are welcome to ignore, but is not meant to be part of that message, I need someplace else to put it. Just like a person can love all their children, it is possible to like both namespaces and headers.
FYI: SOAP headers have two features worth mentioning: a mustUnderstand attribute, which can be used to indicate whether or not something is optional, and an actor attribute, which can indicate who the header is intended for.
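A small sketch of how those two attributes appear on the wire and how a receiver might read them, assuming SOAP 1.1 (where both attributes live in the envelope namespace). The header block names, the `urn:example:ext` namespace, the gateway URI, and the function `describe_headers` are all made up for illustration:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1

# Hypothetical message: one mandatory header aimed at a gateway, one
# optional header with no explicit actor, and an RSS item as the body.
message = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <x:signature xmlns:x="urn:example:ext"
                 soap:mustUnderstand="1"
                 soap:actor="http://example.org/gateway">abc123</x:signature>
    <x:trace xmlns:x="urn:example:ext">optional-hint</x:trace>
  </soap:Header>
  <soap:Body><item><title>Hello</title></item></soap:Body>
</soap:Envelope>"""


def describe_headers(xml_text):
    """List each header block with its mustUnderstand flag and actor URI."""
    root = ET.fromstring(xml_text)
    header = root.find("{%s}Header" % SOAP_NS)
    blocks = []
    for block in (header if header is not None else []):
        blocks.append({
            "tag": block.tag,
            "must_understand": block.get("{%s}mustUnderstand" % SOAP_NS) == "1",
            "actor": block.get("{%s}actor" % SOAP_NS),
        })
    return blocks
```

A receiver that does not recognize a block flagged `must_understand` and addressed to it is expected to fault rather than ignore it; unflagged blocks can be skipped.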
"If I wanted to put something in there that you are welcome to ignore, but is not meant to be part of that message, I need someplace else to put it."
That is a rather vague statement. We were talking about the very concrete case of 'createNews', that is, POSTing an RSS 'item' fragment to the URL 'RESTLog.cgi' to create a new news item. Can you give a concrete example of the out-of-band information that you wanted to attach?
I tend to think of credentials as something that one puts on a request, and HTML as something you get as a response, but if you consider a digital signature close enough, then yes, I do see the possible value in placing a signature of the HTML body in an HTML header.
An example of out of band information is where you had to punt. If 95% of your servers are OK with basic, digest or other authentication, then not requiring other headers is goodness. Precluding more and/or different authentication schemes is another matter.
OK, so I'm a little loopy. But the question you have to ask yourself: exactly which HTTP headers is it sane to put in meta elements, and which ones is it not. And who decides?
Again, HTML is primarily responses. And I can easily imagine wanting to digitally sign an HTML page. Since that is a computationally expensive operation, I might want to do it once, place the resulting file on the file system, and let the OS and/or web server cache it, as well as any client or proxy.
Sure, one can define new HTTP authentication schemes. It is also possible to move this logic out to the endpoints, like Groove over SOAP apparently does.
Anyway, HTML is a little off the topic I wanted to explore, and part II of Soap by Example is nearly ready...
"But the question you have to ask yourself: exactly which HTTP headers is it sane to put in meta elements, and which ones is it not. And who decides?"
Well, a good rule of thumb would be whether the information could be applied to multiple formats. In particular, to your example of digitally signing documents: wouldn't you also want to sign images, maybe TIFFs of legal documents?
As an example of this I am adding gzip compression of the RSS file to RESTLog this evening.
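As a rough sketch of format-independent compression handled at the HTTP level, assuming Python's standard `gzip` module. The function name `encode_body` and the `text/xml` content type are assumptions, not RESTLog's actual code:

```python
import gzip


def encode_body(body, accept_encoding=""):
    """Return (headers, payload), gzip-compressing when the client allows it."""
    # Simplification: q-values such as "gzip;q=0" are ignored here.
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    headers = {"Content-Type": "text/xml"}
    if "gzip" in offered:
        payload = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    else:
        payload = body
    headers["Content-Length"] = str(len(payload))
    return headers, payload
```

Nothing in the RSS document changes; the only trace of the compression is the `Content-Encoding` header, and the same function would work unchanged for any other body format.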
Joe, is there a reason that you feel that gzip compression can't handle digital signatures?
A different rule of thumb is that HTTP headers should only be used to convey information that is only valid during the duration of the Transfer (the second T in HTTP) process.
Another approach is to sign the image itself (most image formats provide some mechanism for this), so that the signature survives the transfer, i.e., can be received, e-mailed and detached with the signature intact.
But these are just rules of thumb. HTTP preaches that there are exactly two types of data, which are separated by a vast chasm called a blank line. Take the blue pill if you like and you can return to your blissful life in this world.
Reading backwards I think we've drifted a little. Let me try to re-cap where *I* think we are:
Sam picked an example really close to standard SOAP when he asked me to SOAPify 'createNews' by allowing the RSS 'item' to be optionally wrapped in a SOAP Envelope element.
The rest of the discussion (which has nothing to do with REST btw) has been about what value that SOAP Envelope adds. In retrospect it is odd that we drifted here, since Sam clearly stated, "Modulo backwards compatibility problems, and the time it takes to come to consensus and wide deployment, I have no philosophical problem with a hypothetical future version of a spec that indicates that a headerless element is to be treated as equivalent to one with empty headers."
But nevertheless (and all the more) we relentlessly drifted on to the discussion of the value of the SOAP Envelope.
Simon states his opinion pretty clearly when he called it dreck.
I have three problems with the SOAP Envelope:
1. The Multiple Envelope problem.
2. Whatever 'extensibility' it offers is only available to XML documents and not other formats such as PNGs, Acrobat files, etc.
3. It just doesn't add any value. Any meta data you have should either come across in one of two places:
   A. The HTTP headers - For info such as Content-Length, or Content-Type.
   B. In the document itself - Digital signature of an Acrobat file, or the stuff we put in 'meta' and 'link' elements in an HTML file.
Now there are some cases where stuff isn't in the best place. For example we use the 'link' element to point to an 'alternate' version of a resource. It would probably be better if there were a way to query any URL for the different types of content it could return (maybe by using OPTIONS verb?). But I haven't seen a compelling argument that any of these problems requires the creation of a *third* place to store meta data, ala the SOAP Envelope Header.
Good summary, Joe. We may end up agreeing to disagree on point #2, and I'll argue that there is a 3C.
Here's my take:
1) this is a valid, albeit a bit theoretical, problem. Infinite regress rarely occurs in practice.
2) Non XML documents are done as MIME or DIME attachments. Want to package one of these up with some metadata? If so, then a SOAP header is the perfect place for such info.
3) A and B are valid for many cases and, when appropriate, reduce or eliminate the need for SOAP headers. The clearest value of SOAP headers is when you need to send a message through a gauntlet of gateways, and you need to target some specific metadata at one or more of these intermediate nodes. In SOAPSpeak, these nodes are called actors.
If you rewind all the way back to the start of this thread, I had suggested that the blogger API parameters should be split into two parts... the first was info targeting the content management system itself, and the second is info that is intended for whomever accesses that content.
In cases where there is exactly one intermediary (e.g. in RESTLog the distinction between the webserver and the CMS is intentionally blurred), HTTP headers not only suffice, they are ideally suited to the task.
Cool. I think we are getting to some really core stuff.
2) "Non XML documents are done as MIME or DIME attachments."
MIME/DIME attachments don't make much sense to me. You are already using HTTP, which is perfectly capable of transporting such content, so why ignore it?
3) "The clearest value of SOAP headers is when you need to send a message through a gauntlet of gateways, and you need to target some specific metadata at one or more of these intermediate nodes."
On this point I will disagree. You aren't supposed to talk to those intermediate nodes. You should never even know that such intermediate nodes exist. If you make a request to a URL you should only know or care about its response, and not how it got that response. If I query Google I don't care if the results are divined by 10,000 Linux boxes, randomly picked by squads of trained Canadian geese, or signalled from the otherworld via interpretive dance. I should only care about the results.
"In cases where there is exactly one intermediary (e.g. in RESTLog the distinction between the webserver and the CMS is intentionally blurred), HTTP headers not only suffice, they are ideally suited to the task."
This I agree with. The only difference is that I think there should *never* be more than one intermediary.
It is just that, a leak in the abstraction. And an unwarranted one at that. The networks at all my employers and even my network of three computers at home don't require proxy settings to connect to the internet.
For the sake of argument let's assume there is a proxy server and it is visible, are you suggesting that it would be ok to add 'proxyuser' and 'proxypassword' elements to the RESTLog Archive Format?
I'll bet that those networks are asymmetric. Question: does the network at your work or the network at your home permit connections from the internet to you? Most P2P systems (from Napster to IM to Groove) have alternate abstractions to deal with addressing mobile users.
My cell phone has a feature where you can send it a message via the web, and even track the message.
Every time you hop from one addressing scheme to another, you potentially have the need for authentication and/or authorization. And an intermediary - possibly only a logical one, but a conceptual one nevertheless.
I can provide more examples, but the question isn't the number of examples that can be generated, but whether the statement that there should *never* be more than one intermediary is tenable, or merely a local optimization valid for a range of use cases for one application.
SOAP doesn't require headers in all messages. The question on the table is whether they should be precluded.
RLAF looks user and connection independent. Any such information (if appropriate to this application, at all) should be modeled and persisted elsewhere.