To infinity and beyond - the quest for SOAP interoperability
By Sam Ruby, February 1, 2002.
This document makes the case that, in addition to the SOAP standard, we need to document a set of best practices and create a classification of errors that identifies potential interoperability issues.
We worked really hard to make things this easy
Unless you have been hiding under a rock for the last few years, web services is a term you have heard more than a few times before. At the core is the Simple Object Access Protocol (SOAP) which is a product of respected industry leaders such as DevelopMentor, International Business Machines Corporation, Lotus Development Corporation, Microsoft, and UserLand Software. SOAP has been accepted by the W3C and was published as a technical note nearly two years ago. Work on a 1.2 version is well underway.
The movement definitely has momentum. With several dozen implementations already out there, and categories in Google and Yahoo, you would think that most of our problems are behind us, right?
Developers in this field know all too well that reality is not quite so simple. With all the excitement and progress, there is still much that needs to be done to make interoperability not just simple, but downright automatic. After all, that’s what this is all about, right? I mean, without interoperability, what is the point?
Before you go on, I must warn you that the following is not intended reading for people who don’t like seeing sausage made. It describes some of the nitty gritty details that have been going on behind the scenes to make interoperability the success that it is today. While I have generally chosen to draw on experiences I have had implementing Apache’s SOAP and then Axis, I do know where many of the skeletons are buried. ;-) Actually, that’s not really fair - having worked closely with a large number of soapbuilders, I can honestly say that not only are these problems universal, but without exception the developers of the various stacks are very eager to address any problems that are brought to their attention.
Before Apache Axis, there was Apache SOAP. Before Apache SOAP, there was IBM's SOAP4J. It was quick to market, and by many accounts it was the most spec compliant implementation at the time. But as with most things, there were some short cuts taken in the process that later were noticed as bugs. I start this tale by describing one such teensy tiny little bug.
Floating point numbers are represented in SOAP as a familiar sequence of decimal digits, with exponents written in the traditional way that engineers and programmers have used for decades. The IBM SOAP4J implementation, being written in Java, used the toString method and string constructor to take care of the messy administrative details of conversion.
Life was good.
Or was it? Sure, every number expressible as a series of decimal digits, with optional fractional and exponential parts, worked fine. But there were those pesky edge cases to worry about. One such edge case is the number infinity. It seems that the designers of Java spelled this quantity "Infinity", and the designers of XML Schema spelled this float quantity "INF".
Oops.
What this meant is that clients based on IBM's SOAP4J, while not spec compliant, were fully interoperable with servers based on IBM's SOAP4J. Even for infinity, as the technique used to "serialize" this data matched the one used to "deserialize" this data. But for this one edge case, IBM's implementation was not guaranteed to interoperate with any other SOAP implementation. Oops indeed!
Apache SOAP was changed to accept "INF" as a valid representation of the quantity known as infinity. It still produces "Infinity", but will accept either form now. This means that it is fully interoperable with IBM's SOAP4J and accepts a greater range of SOAP compliant messages, so on the inbound side it interoperates with a greater range of SOAP implementations. It still produces non-spec compliant messages in this case, however.
Apache Axis still accepts both forms, but now correctly produces the desired "INF". This sacrifices compatibility with the original IBM SOAP4J, which it superseded two generations ago, but it remains interoperable with its immediate predecessor (Apache SOAP) and now fully interoperates with a wide range of SOAP stacks when dealing with the quantity of infinity.
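The "accept both spellings, produce only the spec's spelling" policy can be sketched in a few lines of Java. This is a minimal illustration, not the actual Apache SOAP or Axis code; the class name XsdDouble and its methods are hypothetical.

```java
final class XsdDouble {
    /** Serializes a double, using the XML Schema spellings for the special values. */
    static String format(double value) {
        if (Double.isNaN(value)) return "NaN";
        if (value == Double.POSITIVE_INFINITY) return "INF";
        if (value == Double.NEGATIVE_INFINITY) return "-INF";
        return Double.toString(value);
    }

    /** Accepts the XML Schema spellings and, leniently, Java's own "Infinity". */
    static double parse(String text) {
        String s = text.trim();
        switch (s) {
            case "INF":
            case "Infinity":
                return Double.POSITIVE_INFINITY;
            case "-INF":
            case "-Infinity":
                return Double.NEGATIVE_INFINITY;
            case "NaN":
                return Double.NaN;
            default:
                return Double.parseDouble(s);
        }
    }

    public static void main(String[] args) {
        // What Java produces on its own, and what the wire format wants.
        System.out.println(Double.toString(Double.POSITIVE_INFINITY)); // Infinity
        System.out.println(format(Double.POSITIVE_INFINITY));          // INF
        // Both spellings are accepted on the way in.
        System.out.println(parse("INF") == parse("Infinity"));         // true
    }
}
```

The asymmetry is deliberate: being liberal in parse costs nothing, while being strict in format is what keeps the messages portable.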
WOO HOO!
Again, it is worth pointing out that many, many other SOAP stacks have demonstrated this same problem. Some still do. Meanwhile IBM, Macromedia, and other companies are aggressively incorporating the changes that have been made to the Apache implementation into their product lines.
Applying SOAP to your Erroneous Zones
The designers of the Ada programming language took compatibility and interoperability seriously. In fact they even went so far as to create a classification of errors. The one I wish to focus on for the moment is the category of erroneous execution. In a SOAP context, this would correspond to messages that do not conform to the specification. As with Ada, erroneous messages may produce useful results with specific versions of some implementations, but in general, are not portable or interoperable.
One such case is described above. Messages that purport to contain a floating-point quantity with a value of "Infinity" will be accepted by the Apache Axis implementation. Other implementations may fault upon receiving such a message. Both reflect implementation choices. Defensible implementation choices.
Accepting Infinity and producing a message containing Infinity in the expectation that it will be accepted are two different things. The first is being nice. The second is the result of an erroneous assumption. It may work for now with the configuration you have installed, but quite likely somebody, somewhere, either soon or sometime in the distant future, will not like the message you sent. And you can't exactly blame them, as you aren’t exactly conforming to the spec.
Another such example is floating point overflow and underflow. These are attempts to represent as floating point quantities values that are outside of the range supported by the IEEE 754 standard. The EasySoap++ client interoperability results contain a table with the results of "More Floating Point Tests". It is clear that the designers of this test expect that the servers should fault when they receive a message containing such values. But should they?
The same web page contains a link to the IEEE-754 References. That page lists the same value for +Infinity and Positive Overflow. But the EasySoap++ interoperability tests seem to expect a fault in this case, and mark any server that does not match their expectations with a failure. Which is right?
My take is that such a test is itself erroneous. By that I mean that the test provides meaningful information insofar as it helps to categorize the range of possibilities available in the current implementations, but that's about it. In particular, the results should not be categorized as a pass or a fail. And applications that expect to be interoperable should not expect a specific result from this type of message.
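To see why a server might not fault, consider what Java's own parser does with out-of-range text. The sketch below is only an illustration of the platform behavior; the class FloatRangeCheck and its method are hypothetical, and the BigDecimal comparison is just one assumed way a sender could detect that it is about to emit such a value.

```java
import java.math.BigDecimal;

final class FloatRangeCheck {
    /**
     * True when the text names a finite decimal value that a double can
     * only hold as an infinity (overflow) or as zero (underflow).
     */
    static boolean overflowsOrUnderflows(String lexical) {
        BigDecimal exact = new BigDecimal(lexical);
        double parsed = Double.parseDouble(lexical);
        boolean overflow = Double.isInfinite(parsed);
        boolean underflow = parsed == 0.0 && exact.signum() != 0;
        return overflow || underflow;
    }

    public static void main(String[] args) {
        // Java does not throw on out-of-range input; it quietly saturates.
        System.out.println(Double.parseDouble("1e400"));   // Infinity
        System.out.println(Double.parseDouble("1e-400"));  // 0.0
        System.out.println(overflowsOrUnderflows("1e400")); // true
        System.out.println(overflowsOrUnderflows("1.5"));   // false
    }
}
```

A stack built on this parser would have to go out of its way to fault, which is one reason the choice looks like a matter of implementation rather than conformance.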
In fact, ideally, they should never generate this type of message at all.
Close enough for government work
Decimal numbers appear frequently in financial calculations. Per the XML Schema specification, decimals represent arbitrary precision decimal numbers.
Regretfully, the Apache implementation is based on the Java implementation of BigDecimal, so it is limited to supporting a few billions of digits. And I can't say I personally have ever verified anything remotely close to that limit.
Thankfully, the XML Schema specification indicates that a minimally conforming implementation must support at least 18 digits of precision. I guess Apache and Java then squeak by.
Since an 18-digit limitation is specified for minimally conforming applications, one can infer that depending on more than 18 digits of precision would be erroneous. That doesn't mean that a SOAP implementation cannot or should not exceed this minimum. After all, sending arbitrary precision decimal numbers is permitted by the specification. So, if you have 75,638 digits of precision, the Apache implementations of SOAP will preserve every last bit. While that seems desirable, it does have the down side that it leaves to the application the problem of determining if any loss of precision is significant.
By contrast, the Microsoft .Net implementations are limited to a mere 29 significant digits of precision. (Only exceeding the specifications by a factor of 100 billion or so, sheesh). Seriously, the choice of the number of digits supported is a tradeoff of a number of factors including performance and customer requirements, and different vendors are free to make this tradeoff in different ways.
When all is said and done, this means that Apache and ASP.Net can reliably exchange decimal quantities to 29 digits of precision. All in all, that's not bad given the differences in platform implementation.
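To make the 29-digit ceiling concrete, here is a small Java sketch. The class and method names are hypothetical, and rounding half-even to 29 significant digits is only an assumed stand-in for what a precision-limited receiver might do; it is not a description of the actual .Net behavior.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

final class DecimalPrecisionDemo {
    // Assumed model of a receiver that keeps at most 29 significant digits.
    static final MathContext TWENTY_NINE_DIGITS =
            new MathContext(29, RoundingMode.HALF_EVEN);

    static BigDecimal asSeenByLimitedReceiver(BigDecimal sent) {
        return sent.round(TWENTY_NINE_DIGITS);
    }

    /** True when the limited receiver ends up with exactly the value sent. */
    static boolean survivesRoundTrip(BigDecimal sent) {
        return sent.compareTo(asSeenByLimitedReceiver(sent)) == 0;
    }

    public static void main(String[] args) {
        BigDecimal fits = new BigDecimal("1234567890.1234567890123456789");     // 29 digits
        BigDecimal tooLong = new BigDecimal("1234567890.12345678901234567891"); // 30 digits
        System.out.println(fits.precision() + " digits, survives: " + survivesRoundTrip(fits));
        System.out.println(tooLong.precision() + " digits, survives: " + survivesRoundTrip(tooLong));
    }
}
```

An application that cares can run a check of this kind before sending, which is exactly the burden that unlimited precision pushes onto the application.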
Careful readers will have noticed something subtle occurred in this section. Previously, it was only invalid messages that were erroneous. Now we are talking about perfectly valid, and arguably useful, messages that are erroneous. Scary thought.
Does anybody really know what time it is?
The dateTime data type in XML Schema contains centuries, years, months, days, hours, minutes, seconds "with any number of digits after the decimal is supported". There we go again. But this time, there is **GASP** no specification as to the minimum number of digits that an application must support.
The Apache implementation maps dateTime to instances of the java.util.Date class. This class supports precisions up to the nearest millisecond. When was the last time you needed to record a date to the nearest millisecond? Well now you can!
Digits specified beyond a millisecond will be validated but not retained.
By contrast, the .Net implementation will retain dates to 100-nanosecond units - 4 extra digits of precision. I'm sure that comes in handy.
Again, we have complete interop to the limitations of the platforms, with messages that exceed those boundaries being correctly parsed but not processed to the specified level of precision. Looking at other implementations, some seem to stop at the unit of a second, discarding all fractions.
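The three levels of precision mentioned above can be compared side by side. This sketch uses the java.time classes of a modern JDK, which is an assumption on my part and not the java.util.Date mapping described earlier; the class name is hypothetical, and truncation (rather than rounding) is likewise assumed.

```java
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

final class DateTimePrecisionDemo {
    /** Millisecond precision, as with java.util.Date. */
    static OffsetDateTime toMillis(OffsetDateTime t) {
        return t.truncatedTo(ChronoUnit.MILLIS);
    }

    /** 100-nanosecond units, the granularity attributed to .Net above. */
    static OffsetDateTime toHundredNanos(OffsetDateTime t) {
        return t.withNano(t.getNano() / 100 * 100);
    }

    /** Whole seconds only, discarding all fractions. */
    static OffsetDateTime toSeconds(OffsetDateTime t) {
        return t.truncatedTo(ChronoUnit.SECONDS);
    }

    public static void main(String[] args) {
        // A valid dateTime carrying nine fractional digits.
        OffsetDateTime sent = OffsetDateTime.parse("2002-02-01T12:34:56.123456789Z");
        System.out.println("sent:      " + sent);
        System.out.println("millis:    " + toMillis(sent));
        System.out.println("100 nanos: " + toHundredNanos(sent));
        System.out.println("seconds:   " + toSeconds(sent));
    }
}
```

All three receivers parse the same message successfully; they simply disagree about how much of it to keep.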
Recap: first, we had erroneous messages containing well formed but not semantically valid data. Then we had erroneous messages with semantically correct data. Now we have valid messages for which we have no way of determining whether they are erroneous or not. Can it get any worse than this?
Returning to decimal, the specification states that trailing zeros are optional. In fact, in the canonical form, trailing zeros are prohibited. However one reads these two statements, one clearly comes away with the impression that trailing zeros are not intended to be semantically significant. In my opinion, this is rather unfortunate as it encodes into the specification a behavior for which there isn't complete consensus in the industry. Those that have studied decimal values strongly believe that precision is very significant. Many references can be found here.
Java follows these other standards, and therefore in the native data types used for decimals, precision is significant. What's a Java based implementation to do then when dealing with decimal quantities via SOAP?
For starters, there is nothing in the spec that says that every message must use the canonical form of all data. So trailing zeros in that case are merely optional, which is another way of saying that they are permitted. And if the Java implementation retains these optional digits, what's the harm?
Note that this leaves it up to the application whether or not to enforce the additional SOAP semantics. But we already were in that boat when we decided to support beyond the specified minimum number of digits.
What makes this case so disquieting is that we have a case of an implementation intentionally generating valid but erroneous messages. Ones that seem discouraged but not quite outlawed by the XML Schema specifications. And why was this done? It is worth repeating that this is not just for the added precision and customer value, but also in order to comply with other standards.
And it is also worth noting that no interoperability is lost in the process. I know of no implementation that cannot parse valid decimal quantities with trailing zeros. True, not all of them retain this precision, but that is just a quality of implementation issue.
True or false? What can be simpler? SOAP and XML Schema permit alternate encoding of boolean values as "0" and "1". While I have yet to find a definitive reference that states which of these values is supposed to be canonicalized to true and which to false, this seems pretty obvious. [I won't mention that return codes of zero traditionally mean success and non-zero means failure.]
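Java's BigDecimal shows both points of view at once: trailing zeros are retained and affect equals, yet they do not affect numeric comparison. A minimal sketch, with a hypothetical class name and helper methods:

```java
import java.math.BigDecimal;

final class TrailingZerosDemo {
    /** Numeric comparison: trailing zeros are not significant. */
    static boolean sameValue(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }

    /** BigDecimal.equals also compares scale: trailing zeros are significant. */
    static boolean sameValueAndScale(String a, String b) {
        return new BigDecimal(a).equals(new BigDecimal(b));
    }

    /** Drops the trailing zeros that a sender may want to omit. */
    static String withoutTrailingZeros(String lexical) {
        return new BigDecimal(lexical).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(new BigDecimal("1.50"));           // 1.50 (zeros retained)
        System.out.println(sameValue("1.50", "1.5"));         // true
        System.out.println(sameValueAndScale("1.50", "1.5")); // false
        System.out.println(withoutTrailingZeros("1.50"));     // 1.5
    }
}
```

An application that wants the SOAP view of the world can use compareTo or strip the zeros; one that wants the precision-preserving view can leave the value as received.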
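For what it is worth, here is how a lenient boolean reader might look in Java. It assumes the conventional mapping of "1" to true and "0" to false; the class XsdBoolean is hypothetical and is not taken from any of the stacks discussed here.

```java
final class XsdBoolean {
    /** Accepts all four lexical forms; assumes "1" means true and "0" means false. */
    static boolean parse(String text) {
        switch (text.trim()) {
            case "true":
            case "1":
                return true;
            case "false":
            case "0":
                return false;
            default:
                throw new IllegalArgumentException("not a boolean: " + text);
        }
    }

    /** Always produces the word form, never the digit form. */
    static String format(boolean value) {
        return value ? "true" : "false";
    }

    public static void main(String[] args) {
        System.out.println(parse("1"));                // true
        System.out.println(format(parse("0")));        // false
        // Java's built-in parser only recognizes "true", so "1" comes out false.
        System.out.println(Boolean.parseBoolean("1")); // false
    }
}
```

Note the last line: leaning on the platform's built-in conversion here would repeat the "Infinity" story, with "1" silently read as false.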
Now imagine a simple array of booleans. Not much more complex? Let's look at some of the issues being debated on SOAP builders, recast using this data type.
First, there is the issue of nil. It seems to represent a third (or fifth, depending on how you count) possible value. This seems reminiscent of SQL's null, loosely equating to unspecified or unknown.
Then there is the issue of omitted elements in partially transmitted arrays. From a pure SOAP perspective, this is expressed in a manner that is clearly distinguishable from any of the previously mentioned values (including nil).
Then there is the issue of multi-ref. It is quite possible to encode using SOAP an array where the first two true values are clearly the same, but quite distinct from the third value, even though it also happens to be true.
Huh?
Are such uses of SOAP valid? Actually, yes. But do they have a chance of being interoperable? In my opinion - not a prayer. In languages like Java, there is a sensible mapping that could be done, but in a language like Perl these distinctions would be a tad bit artificial.
At least one SOAP implementation (SOAP4R) has implemented interoperability tests that check for a distinction such as this one. Per above, I think such tests provide valuable information, but any implementation that relies on this behavior, or any test of this kind rated as a simple pass/fail, is erroneous.
Success for SOAP as a protocol requires satisfying two sometimes-conflicting goals. First, it must specify semantics precisely and clearly so that valid and validatable implementations can be made. Second, these semantics must have a fairly natural mapping to a wide range of the popular languages and platforms that are in existence today. The reason why these goals sometimes conflict is that there is a fair amount of diversity out in the industry.
Despite the impression you may have gotten from the above text, the semantic match to a wide range of languages and platforms is actually very good, with differences only showing up at the 30th decimal place, as it were. Furthermore, the differences represent legitimate manifestations of valid tradeoffs made in implementations that target different markets.
There are a few places in my opinion where the SOAP specification itself could spell out the limits to which implementations should be able to rely on semantics being preserved. But in addition to that, there needs to be an effort to categorize and document each implementation with respect to the types of issues I've described here so that developers can make informed choices when developing their web service and choosing a language or platform.
For the near term, this is not so much an endpoint as a process. And given the outstanding cooperation that has been exhibited so far, I see this progress only continuing to accelerate.
To infinity and beyond, indeed!