It’s just data

All the way to SOAP

Apparently, the Monastic one (a.k.a. my personal albatross) sees some value in a data binding package, and he correctly points out that validation may not be a necessary part of the equation.  I'll take what I can get.

Meanwhile, he expresses surprise, not once but twice, that I would be willing to go all the way to web services and all the way to SOAP in order to achieve this.

So I felt it instructive to explore exactly how far that is.  This resulted in a new essay entitled Soap By Example.  Enjoy.


Sam,
I don't get it. So you show that a SOAP application can be written using an HTTP module and the XML DOM. So what? I could similarly show you could write the application using the C string functions and POSIX sockets. What does that prove?

PS: Your gift is in the mail. :)

Posted by Dare Obasanjo at

It's confusing being an albatross.

I just don't see what you gain with your soap:Envelope and soap:Body elements.

I think you're answering questions I'm not asking. SOAP is XML - sure, duh. What I'm asking is what you're gaining with the extra SOAP dreck. What I'm seeing is nothing.

Probably time to find a different neck to hang around.

Posted by Simon St.Laurent at

Sorry for the quick second post, but it just struck me that instead of asking people to add soap:Body and soap:Envelope elements to their documents, maybe SOAP processors should just accept messages which happen to be missing those elements.

In this case, the gs:doGoogleSearch element is effectively all the info you need to process.

I can't be the first to suggest that. Guess I now have a reason to re-read the ever-delightful SOAP specs.
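
A minimal sketch of that suggestion, assuming a receiver built on Python's standard ElementTree parser. The helper name extract_payload is hypothetical, and the urn:GoogleSearch namespace is an assumption about the service; only the SOAP 1.1 envelope namespace is standard. The receiver treats a bare message and a wrapped one identically:

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def extract_payload(xml_text):
    """Return the application element, with or without a SOAP wrapper."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}Envelope" % SOAP_NS:
        # Bare message: the root element is already the payload.
        return root
    body = root.find("{%s}Body" % SOAP_NS)
    if body is None or len(body) == 0:
        raise ValueError("SOAP envelope has no body content")
    # The first child of soap:Body is the payload.
    return body[0]


# Assumed namespace and element names, for illustration only.
bare = (
    '<gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">'
    "<q>albatross</q></gs:doGoogleSearch>"
)
wrapped = (
    '<soap:Envelope xmlns:soap="%s"><soap:Body>%s</soap:Body></soap:Envelope>'
    % (SOAP_NS, bare)
)

# Both forms yield the same payload element.
assert extract_payload(bare).tag == extract_payload(wrapped).tag
```

Note that this only covers the headerless case; a message that carries header blocks still needs the envelope to hold them.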

Posted by Simon St.Laurent at

Simon - believe it or not, we are making progress. My point was to identify exactly what "all the way to x" meant. Apparently it means two elements.

Modulo backwards compatibility problems, and the time it takes to come to consensus and wide deployment, I have no philosophical problem with a hypothetical future version of a spec that indicates that a headerless element is to be treated as equivalent to one with empty headers. Personally, I think that the issue of not treating URIs as first class concepts is a more important problem to tackle than two elements, but that's just me.

Now that I've had some fun at your expense, I'm willing to make my first truly controversial statement. IMHO, the reason for these elements is to allow extensible headers to be added. What can such headers be used for? For starters, they can address cases where it is not practical to punt. Joe happens to be running on a server where he has access to his .htaccess file. Not everyone is so lucky.

Posted by Sam Ruby at

One of my pet peeves is simple examples that don't encourage fundamental good behaviour. Although less than, greater than and ampersand aren't likely to appear in a Google query, I'd feel all warm and fuzzy inside if your do_search() function used something like:

template % (xml.sax.saxutils.escape(key), xml.sax.saxutils.escape(q))

instead of template % (key, q) to ensure that the query remains well-formed XML.
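
To illustrate why the escaping matters, here is a small sketch. The template is a hypothetical stand-in (the essay's real one is not reproduced here), and is_well_formed is a helper invented for this example:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical, shortened stand-in for the essay's request template.
template = "<search><key>%s</key><q>%s</q></search>"


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


query = "AT&T <research>"
raw = template % ("demo-key", query)
safe = template % (escape("demo-key"), escape(query))

print(is_well_formed(raw))   # False: the bare & and < break the document
print(is_well_formed(safe))  # True: escape() rewrote &, < and >
```

After parsing the escaped version, the text of the q element is the original query string again, so nothing is lost by escaping.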



"I just don't see what you gain with your soap:Envelope and soap:Body elements."

You get a response.

Without them, a SOAP service - that is written by somebody over at Company X which provides a service you're interested in - will not respond to your query.

I recently found myself in this situation. In my humble opinion, it would have been a simpler and better interface if the extra SOAP elements didn't get in the way, particularly on the request side, where a command like "give me the metadata for object 41" or "send me objects 12 and 13" would have made for a very simple GET request. But I didn't get to make these decisions; I just have to make our piece of the client's puzzle talk to theirs.

Posted by Daryl Oidy at

That's fun, eh? Weird.

"The reason for these elements is to allow extensible headers to be added. What can such headers be used for? For starters, they can address cases where it is not practical to punt. Joe happens to be running on a server where he has access to his .htaccess file. Not everyone is so lucky. "

Fascinating stuff. Not. I guess maybe it hasn't occurred to you that XML itself is plenty extensible, and that the kind of extensibility you're proposing to add is exactly the kind of stuff that gives SOAP a lousy name with folks who value the Web more than Web Services.

If you want SOAP's odd notion of extensible and are happy to glue blank envelopes to other people's messages to get it, that's fine for you. Just keep in mind that a lot of people won't fall for that - we don't need it, want it, or like it.

A lot of us do in fact have access to .htaccess, and I hate to imagine someone building a serious services site who didn't. As a web server admin, I'd be a lot more cautious about giving my users access to SOAP processing than to .htaccess.

Thanks for the fun, but I have to get back to the holidays. If you have a chance to study XML's capabilities beyond the boundaries of SOAP, maybe we can talk sometime.

Posted by Simon St.Laurent at

Daryl: OK, I'm game. I've made the fix. I also changed it so that the template is filled in using named parameter association, not positional.
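
A sketch of what named parameter association can look like with Python's % operator. This is not the essay's actual code: the template, the build_request name, and the urn:GoogleSearch namespace are assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# Assumed request shape; the placeholders are filled by name, not by position.
template = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">
      <key>%(key)s</key>
      <q>%(q)s</q>
    </gs:doGoogleSearch>
  </soap:Body>
</soap:Envelope>"""


def build_request(key, q):
    """Fill the template by name, escaping each value for XML."""
    return template % {"key": escape(key), "q": escape(q)}


print(build_request("demo-key", "soap & rest"))
```

With named placeholders, reordering or adding fields in the template cannot silently swap the key and the query.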

Simon, my guess is that if Joe was ever on a machine where he could not modify his .htaccess files, he would probably want some place else to put this information than the element itself. In any case, the only "SOAP processing" we are talking about here is the ability to get access to the XML document which was sent.

All I've heard so far that you don't want or like is the two elements. In this case, my attitude is like Daryl's - if two elements is the cost of consensus, then I'm happy to focus my energies on other issues.

Posted by Sam Ruby at

Oh I see, it's another vi vs. Emacs *cough* I mean REST vs. SOAP argument.

Aren't these passe yet?

:)

Posted by Dare Obasanjo at

Sam, I think it's time to focus your energy on other issues.

You've stripped SOAP down to the point where it masquerades as REST, for the mere price of "two elements". Then you try to lead people down the SOAP path by promising them SOAP-style extensibility using those two elements, effectively leaving REST behind.

SOAP and REST are two separate paths once you get past trivial examples. Pretending that this is just about two elements is a nice public relations effort, but not very constructive. If I wanted a SOAP response, I'd have asked for one. I don't want one.

Dare's right - this is passe. Let's see if there's anything new in 2003.

Posted by Simon St.Laurent at

It is meant to be a REST + SOAP discussion, not a REST vs SOAP discussion.

REST is an architectural style. Moving header information after the blank line in HTTP has technical implications (some pleasant, some not), but doesn't violate the architecture.

Posted by Sam Ruby at

I was worried when you picked a case in RESTLog that was too close to SOAP. Which is why I keep coming back to my questions.

Sam - "Simon, my guess is that if Joe was ever on a machine where he could not modify his .htaccess files, he would probably want some place else to put this information than the element itself."

I have to agree with Simon who said, "...and I hate to imagine someone building a serious services site who didn't."

Either security info comes across in the document or the HTTP headers. Just because SOAP puts it in an element called 'header' doesn't mean it's not in the document.

And even in the extreme case you present (no access to a .htaccess file) if I have access to a scripting facility can't I process those headers myself?
Sure, I wouldn't want to re-implement Basic and Digest authentication myself in a script, but the point is that even in the situation you set up (lack of .htaccess) you won't be locked out of the RESTLog API.
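
As a rough sketch of what processing those headers in a script could involve, here is a parser for the Basic scheme only. The function name parse_basic_auth is hypothetical, and how the Authorization header value reaches a script depends on the server configuration, so it is simply taken as an argument here:

```python
import base64


def parse_basic_auth(header_value):
    """Return (user, password) from an Authorization header value, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 as well as bytes that are not valid UTF-8.
        return None
    user, sep, password = decoded.partition(":")
    if not sep:
        return None
    return user, password


token = base64.b64encode(b"joe:secret").decode("ascii")
print(parse_basic_auth("Basic " + token))  # ('joe', 'secret')
```

Checking the decoded credentials against a user store, and everything Digest requires, is left out of this sketch.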

Can we move on to the other cases? And when we finish with those I was thinking of adding an interface to RESTLog for uploading pictures... :)


Posted by joe at

Joe, I'm getting there. ;-)

My next step is a question or two for you to ponder. Oh, and a few more examples. But first I would like to finish this thread.

Joe, can you tell me what the meta tag is that appears in http://radio.weblogs.com/0103451/ ? I also use it in http://radio.weblogs.com/0101679/ . Will its use in either of these cases eventually lead the RESTful Web to implode?

Posted by Sam Ruby at

Certainly it is quite entertaining to watch you all tussle but leads me to wonder if this is why the whole XML and standards area is such a mess: people in constant disagreement.

Posted by Sam Gentile at

Sam, that reminds me of a joke.

Sure XML has problems. So does Microsoft.

If those are problems, I wish I had more.

Posted by Sam Ruby at

Heheh, why yes HTML does have a whole 'head' element for meta data.

Would you add security info to HTML HEAD?

Maybe HTML could be made more extensible by wrapping *it* in a SOAP envelope. Or SOAP could be made more extensible by wrapping *it* in an HTML envelope. Of course do we put the SOAP or the HTML as the outer envelope? This is the second time this problem has come up. The "Multiple Envelope" problem.

What about other types of documents returned from servers and presented in browsers? Should SVG have been given a SOAP envelope? How about GIFs, PNGs, Flash, and Acrobat files?

In the more specific case of XML formats, you already wrote a fine essay about how to extend such formats. Are you advocating not using namespaces anymore?

Posted by joe at

The body of a createNews request is RSS. It is extensible via Namespaces. If I wanted to put something in there that you are welcome to ignore, but is not meant to be part of that message, I need someplace else to put it. Just like a person can love all their children, it is possible to like both namespaces and headers.

FYI: SOAP headers have two features worth mentioning: a mustUnderstand attribute, which can be used to indicate whether or not something is optional, and an actor attribute, which can indicate who this header is intended for.
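
For readers who have not seen these attributes, a small sketch follows. The attribute names and the envelope namespace are from SOAP 1.1; the header blocks (ex:credentials, ex:trace), their namespaces, and the actor URI are invented for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1

# Hypothetical message: one mandatory, targeted header and one optional header.
envelope = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <ex:credentials xmlns:ex="urn:example:auth"
        soap:mustUnderstand="1"
        soap:actor="http://example.org/cms">token</ex:credentials>
    <ex:trace xmlns:ex="urn:example:trace">abc</ex:trace>
  </soap:Header>
  <soap:Body><item><title>Hello</title></item></soap:Body>
</soap:Envelope>"""


def describe_headers(xml_text):
    """List each header block with its mustUnderstand flag and actor."""
    root = ET.fromstring(xml_text)
    header = root.find("{%s}Header" % SOAP_NS)
    blocks = header if header is not None else []
    return [
        {
            "tag": block.tag,
            "must_understand": block.get("{%s}mustUnderstand" % SOAP_NS) == "1",
            "actor": block.get("{%s}actor" % SOAP_NS),
        }
        for block in blocks
    ]


for info in describe_headers(envelope):
    print(info)
```

A header without mustUnderstand="1" may be ignored by a receiver that does not recognize it; one without an actor is aimed at the final recipient.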

Posted by Sam Ruby at

Not gonna let you get off the hook that easily ;)

Would you add security info to HTML HEAD?

"If I wanted to put something in there that you are welcome to ignore, but is not meant to be part of that message, I need someplace else to put it."

That is a rather vague statement. We were talking about the very concrete case of 'createNews', that is, POSTing an RSS 'item' fragment to the URL 'RESTLog.cgi' to create a new news item. Can you give a concrete example of the out-of-band information you wanted to attach?

Posted by joe at

I tend to think of credentials as something that one puts on a request, and HTML as something you get as a response, but if you consider a digital signature close enough, then yes, I do see the possible value in placing a signature of the HTML body in an HTML header.

An example of out of band information is where you had to punt. If 95% of your servers are OK with basic, digest or other authentication, then not requiring other headers is goodness. Precluding more and/or different authentication schemes is another matter.

Posted by Sam Ruby at

Ok, I wasn't precise enough in my question. I will try again.

Would you add _credentials_ to HTML HEAD?

If the answer is no, why is it okay then to put _credentials_ in Envelope Header?



Posted by joe at

Yes.

Posted by Sam Ruby at

Well at least you are consistent. Insane, but consistent. Putting credentials in HTML HEAD is just wrong.

"Precluding more and/or different authentication schemes is another matter."

So we're stuck with just Basic and Digest authentication in HTTP and no new schemes can be added?

Or are you saying that new schemes *should* be done in HTML HEAD elements?

Posted by joe at

OK, so I'm a little loopy. But the question you have to ask yourself: exactly which HTTP headers is it sane to put in meta elements, and which ones is it not. And who decides?

Again, HTML is primarily responses. And I can easily imagine wanting to digitally sign an HTML page. Since that is a computationally expensive operation, I might want to do it once, place the resulting file on the file system, and let the OS and/or web server cache it, as well as any client or proxy.

Sure, one can define new HTTP authentication schemes. It is also possible to move this logic out to the endpoints, like Groove over SOAP apparently does.

Anyway, HTML is a little off the topic I wanted to explore, and part II of Soap by Example is nearly ready...

Posted by Sam Ruby at

"But the question you have to ask yourself: exactly which HTTP headers is it sane to put in meta elements, and which ones is it not. And who decides?"

Well, a good rule of thumb would be whether the information could be applied to multiple formats. Take your example of digitally signing documents: wouldn't you also want to sign images, maybe TIFFs of legal documents?

As an example of this I am adding gzip compression of the RSS file to RESTLog this evening.
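
A sketch of how such compression stays entirely in the HTTP headers, leaving the RSS itself untouched. This is not RESTLog's code: maybe_compress is a hypothetical helper, the text/xml content type is an assumption, and q-values in Accept-Encoding are ignored for brevity.

```python
import gzip


def maybe_compress(body, accept_encoding):
    """Return (headers, payload), gzipping only if the client offers gzip."""
    headers = {"Content-Type": "text/xml"}
    # Keep just the coding names; parameters such as ";q=0.5" are dropped.
    offered = [
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    ]
    if "gzip" in offered:
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)
    headers["Content-Length"] = str(len(body))
    return headers, body


rss = b"<rss version='2.0'><channel><title>demo</title></channel></rss>"
headers, payload = maybe_compress(rss, "gzip, deflate")
print(headers["Content-Encoding"])       # gzip
print(gzip.decompress(payload) == rss)   # True
```

The same function works unchanged for a PNG or a PDF, which is the point of the rule of thumb above: the encoding applies to the transfer, not to one document format.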

Posted by joe at

Joe, is there a reason that you feel that gzip compression can't handle digital signatures?

A different rule of thumb is that HTTP headers should only be used to convey information that is only valid during the duration of the Transfer (the second T in HTTP) process.

Another approach is to sign the image itself (most image formats provide some mechanism for this), so that the signature survives the transfer, i.e., can be received, e-mailed and detached with the signature intact.

But these are just rules of thumb. HTTP preaches that there are exactly two types of data, which are separated by a vast chasm called a blank line. Take the blue pill if you like and you can return to your blissful life in this world.

Posted by Sam Ruby at

I am so incredibly lost.

Posted by Mark at

Reading backwards I think we've drifted a little. Let me try to re-cap where *I* think we are:

Sam picked an example really close to standard SOAP when he asked me to SOAPify 'createNews' by allowing the RSS 'item' to be optionally wrapped in a SOAP Envelope element.

The rest of the discussion (which has nothing to do with REST, btw) has been about what value that SOAP Envelope adds. In retrospect it is odd that we drifted here, since Sam clearly stated, "Modulo backwards compatibility problems, and the time it takes to come to consensus and wide deployment, I have no philosophical problem with a hypothetical future version of a spec that indicates that a headerless element is to be treated as equivalent to one with empty headers."

But never the less (and all the more) we relentlessly drifted on to the discussion of the value of the SOAP Envelope.

Simon states his opinion pretty clearly when he called it dreck.

I have three problems with the SOAP Envelope:
1. The Multiple Envelope problem.
2. Whatever 'extensibility' it offers is only available to XML documents and not other formats such as PNGs, Acrobat files, etc.
3. It just doesn't add any value. Any meta data you have should either come across in one of two places:
A. The HTTP headers - For info such as Content-Length, or Content-Type.
B. In the document itself - Digital signature of an Acrobat file, or the stuff we put in 'meta' and 'link' elements in an HTML file.

Now there are some cases where stuff isn't in the best place. For example, we use the 'link' element to point to an 'alternate' version of a resource. It would probably be better if there were a way to query any URL for the different types of content it could return (maybe by using the OPTIONS verb?). But I haven't seen a compelling argument that any of these problems requires the creation of a *third* place to store meta data, à la the SOAP Envelope Header.

Posted by joe at

Good summary, Joe. We may end up agreeing to disagree on point #2, and I'll argue that there is a 3C.

Here's my take:

1) this is a valid, albeit a bit theoretical, problem. Infinite regress rarely occurs in practice.

2) Non XML documents are done as MIME or DIME attachments. Want to package one of these up with some metadata? If so, then a SOAP header is the perfect place for such info.

3) A and B are valid for many cases and, when appropriate, reduce or eliminate the need for SOAP headers. The clearest value of SOAP headers is when you need to send a message through a gauntlet of gateways, and you need to target some specific metadata at one or more of these intermediate nodes. In SOAPSpeak, these nodes are called actors.
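
To make the gateway case in point 3 concrete, here is a sketch of one intermediate node handling the header blocks aimed at it and forwarding the rest. It follows the SOAP 1.1 actor rules as far as this example needs them; the function name, the header blocks, and the actor URIs under example.org are invented for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1
NEXT_ACTOR = "http://schemas.xmlsoap.org/soap/actor/next"


def process_at_node(xml_text, my_actor, understood):
    """Consume header blocks aimed at this node; return the message to forward."""
    root = ET.fromstring(xml_text)
    header = root.find("{%s}Header" % SOAP_NS)
    if header is not None:
        for block in list(header):  # copy, since blocks are removed while looping
            actor = block.get("{%s}actor" % SOAP_NS)
            if actor not in (my_actor, NEXT_ACTOR):
                continue  # no actor, or another actor: meant for a later node
            required = block.get("{%s}mustUnderstand" % SOAP_NS) == "1"
            if required and block.tag not in understood:
                raise ValueError("header not understood: %s" % block.tag)
            header.remove(block)  # a consumed block is not forwarded
    return ET.tostring(root, encoding="unicode")


# Hypothetical message passing through a gateway on its way to a CMS.
message = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ex="urn:example">
  <soap:Header>
    <ex:routing soap:actor="http://example.org/gateway"
        soap:mustUnderstand="1">fast-lane</ex:routing>
    <ex:credentials soap:actor="http://example.org/cms">token</ex:credentials>
  </soap:Header>
  <soap:Body><item><title>Hello</title></item></soap:Body>
</soap:Envelope>"""

forwarded = process_at_node(
    message, "http://example.org/gateway", {"{urn:example}routing"}
)
print("routing" in forwarded, "credentials" in forwarded)  # False True
```

The gateway never looks inside the body, and the CMS never sees the routing block; each node reads only the metadata addressed to it.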

If you rewind all the way back to the start of this thread, I had suggested that the blogger API parameters should be split into two parts: the first is info targeting the content management system itself, and the second is info that is intended for whoever accesses that content.

In cases where there is exactly one intermediary (e.g. in RESTLog the distinction between the webserver and the CMS is intentionally blurred), HTTP headers not only suffice, they are ideally suited to the task.

Posted by Sam Ruby at

Cool. I think we are getting to some really core stuff.

2) "Non XML documents are done as MIME or DIME attachments."

MIME/DIME attachments don't make much sense to me, you are already using HTTP which is perfectly capable of transporting such content, why ignore it?

3) "The clearest value of SOAP headers is when you need to send a message through a gauntlet of gateways, and you need to target some specific metadata at one or more of these intermediate nodes."

On this point I will disagree. You aren't supposed to talk to those intermediate nodes. You should never even know that such intermediate nodes exist. If you make a request to a URL you should only know or care about its response, and not how it got that response. If I query Google I don't care if the results are divined by 10,000 Linux boxes, randomly picked by squads of trained Canadian geese, or signalled from the otherworld via interpretive dance. I should only care about the results.

"In cases where there is exactly one intermediary (e.g. in RESTLog the distinction between the webserver and the CMS is intentionally blurred), HTTP headers not only suffice, they are ideally suited to the task."

This I agree with. The only difference is that I think there should *never* be more than one intermediary.

Posted by joe at

Joe, never is such a strong word.

Care to explain the meaning of AggieConfig/Proxy?

Abstractions Leak.

Posted by Sam Ruby at

It is just that, a leak in the abstraction. And an unwarranted one at that. The networks at all my employers and even my network of three computers at home don't require proxy settings to connect to the internet.

For the sake of argument let's assume there is a proxy server and it is visible, are you suggesting that it would be ok to add 'proxyuser' and 'proxypassword' elements to the RESTLog Archive Format?

Posted by joe at

I'll bet that those networks are asymmetric. Question: does the network at your work or the network at your home permit connections from the internet to you? Most P2P systems (from napster to IM to Groove) have alternate abstractions to deal with addressing mobile users.

My cell phone has a feature where you can send it a message via the web, and even track the message.

Every time you hop from one addressing scheme to another, you potentially have the need for authentication and/or authorization. And an intermediary - possibly only a logical one, but a conceptual one nevertheless.

I can provide more examples, but the question isn't the number of examples that can be generated, but whether the statement that there should *never* be more than one intermediary is tenable, or merely a local optimization valid for a range of use cases for one application.

SOAP doesn't require headers in all messages. The question on the table is whether they should be precluded.

RLAF looks user and connection independent. Any such information (if appropriate to this application, at all) should be modeled and persisted elsewhere.

Posted by Sam Ruby at

I am not fit to loosen Joe's sandals WRT the core issues discussed above. However, there is one small and completely irrelevant clarification I'd like to make:

It's "Canada Geese", not "Canadian".

Carry on. :)

Posted by Dan Isaacs at

Pingback from Sam Ruby: Universal Personal Proxies

at
